[GitHub] flink pull request: FLINK-2380: allow to specify the default files...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/1524#discussion_r53014739 --- Diff: flink-yarn/src/main/scala/org/apache/flink/yarn/ApplicationMasterBase.scala --- @@ -176,6 +184,8 @@ abstract class ApplicationMasterBase { jobManagerPort, webServerPort, slots, taskManagerCount, dynamicPropertiesEncodedString) + //todo should I also set the FS default here --- End diff -- @rmetzger Yes I know. That comment was forgotten since earlier. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-3304: Making the Avro Schema serializabl...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1635#issuecomment-184688675 Thanks a lot @rmetzger ! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: Makes the task cancellation interv...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1612#issuecomment-184236866 Hello! Just rebased to the new master. Please review. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-3327: ExecutionConfig to JobGraph.
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1583#issuecomment-184236773 Hello! Just rebased to the new master. Please review. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-3304: Making the Avro Schema serializabl...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1635#issuecomment-183938932 Thanks a lot @rmetzger --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-3304: Making the Avro Schema serializabl...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1635 FLINK-3304: Making the Avro Schema serializable. This solves the issue FLINK-3304 by making the Avro Schema serializable. This is done by having a custom serializer which transforms the Schema into a JSON string, and the deserializer de-serializes the JSON to re-create the original schema. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink avro Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1635.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1635 commit 173bf6a013f78fab2352f23cb7dae9399aa0ba5a Author: Kostas Kloudas <kklou...@gmail.com> Date: 2016-02-11T17:24:29Z FLINK-3304: Making the Avro Schema serializable. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2380: allow to specify the default files...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1524#issuecomment-183330257 Thanks for the comment @rmetzger. I changed the error message. Please review and let me know. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2380: allow to specify the default files...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1524#issuecomment-182817638 Thanks @rmetzger for the comment. Will fix it! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2380: allow to specify the default files...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1524#issuecomment-182951707 I have updated the PR with the new comments. Please review the new PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-3327: ExecutionConfig to JobGraph.
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1583#issuecomment-182414285 Please review the new pull request. This pull request is the first step for the one about FLINK-2523. Thanks a lot! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: Makes the task cancellation interv...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1612#issuecomment-182413649 Thanks a lot for the comments @rmetzger and @tillrohrmann . I integrated them. Please review the new pull request. This pull request is based upon the one about FLINK-3327. Thanks a lot! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: ExecutionConfig to JobGraph.
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1583#issuecomment-181917384 Thanks a lot @StephanEwen --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: Makes the task cancellation interv...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/1612#discussion_r52388927 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobGraph.java --- @@ -102,60 +101,150 @@ // /** -* Constructs a new job graph with no name and a random job ID. +* Constructs a new job graph with no name, a random job ID, and the default +* {@link ExecutionConfig} parameters. */ public JobGraph() { this((String) null); } /** -* Constructs a new job graph with the given name, a random job ID. +* Constructs a new job graph with the given name, a random job ID and the default +* {@link ExecutionConfig} parameters. * * @param jobName The name of the job */ public JobGraph(String jobName) { - this(null, jobName); + this(null, jobName, new ExecutionConfig()); + } + + /** +* Constructs a new job graph with the given job ID, the given name, and the default +* {@link ExecutionConfig} parameters. +* +* @param jobID The id of the job. A random ID is generated, if {@code null} is passed. +* @param jobName The name of the job. +*/ + public JobGraph(JobID jobID, String jobName) { + this(jobID, jobName, new ExecutionConfig()); } /** -* Constructs a new job graph with the given name and a random job ID if null supplied as an id. +* Constructs a new job graph with the given name, the given {@link ExecutionConfig}, +* and a random job ID. +* +* @param jobName The name of the job. +* @param config The execution configuration of the job. +*/ + public JobGraph(String jobName, ExecutionConfig config) { + this(null, jobName, config); + } + + /** +* Constructs a new job graph with the given job ID (or a random ID, if {@code null} is passed), +* the given name and the given execution configuration (see {@link ExecutionConfig}). * * @param jobId The id of the job. A random ID is generated, if {@code null} is passed. * @param jobName The name of the job. +* @param config The execution configuration of the job. */ - public JobGraph(JobID jobId, String jobName) { + public JobGraph(JobID jobId, String jobName, ExecutionConfig config) { this.jobID = jobId == null ? new JobID() : jobId; this.jobName = jobName == null ? "(unnamed job)" : jobName; + this.executionConfig = config; --- End diff -- It cannot, I just added a check. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: Makes the task cancellation interv...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/1612#discussion_r52388853 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobGraph.java --- @@ -102,60 +101,150 @@ // /** -* Constructs a new job graph with no name and a random job ID. +* Constructs a new job graph with no name, a random job ID, and the default +* {@link ExecutionConfig} parameters. */ public JobGraph() { this((String) null); } /** -* Constructs a new job graph with the given name, a random job ID. +* Constructs a new job graph with the given name, a random job ID and the default +* {@link ExecutionConfig} parameters. * * @param jobName The name of the job */ public JobGraph(String jobName) { - this(null, jobName); + this(null, jobName, new ExecutionConfig()); + } + + /** +* Constructs a new job graph with the given job ID, the given name, and the default +* {@link ExecutionConfig} parameters. +* +* @param jobID The id of the job. A random ID is generated, if {@code null} is passed. +* @param jobName The name of the job. +*/ + public JobGraph(JobID jobID, String jobName) { + this(jobID, jobName, new ExecutionConfig()); --- End diff -- Same as before. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: Makes the task cancellation interv...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/1612#discussion_r52388350 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/deployment/TaskDeploymentDescriptor.java --- @@ -141,9 +146,16 @@ public TaskDeploymentDescriptor( List requiredJarFiles, List requiredClasspaths, int targetSlotNumber) { - this(appId, jobID, vertexID, executionId, taskName, indexInSubtaskGroup, numberOfSubtasks, attemptNumber, - jobConfiguration, taskConfiguration, invokableClassName, producedPartitions, - inputGates, requiredJarFiles, requiredClasspaths, targetSlotNumber, null, -1); + this(appId, jobID, vertexID, executionId, new ExecutionConfig(), taskName, indexInSubtaskGroup, --- End diff -- This constructor only exists for tests. The executionConfig in the TDD is the change of this pull request, so here the new ExecutionConfig is just to not break the already existing tests that were testing other parts of the functionality of the TDD. New tests are added to test the new functionality. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: Makes the task cancellation interv...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/1612#discussion_r52388830 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobGraph.java --- @@ -102,60 +101,150 @@ // /** -* Constructs a new job graph with no name and a random job ID. +* Constructs a new job graph with no name, a random job ID, and the default +* {@link ExecutionConfig} parameters. */ public JobGraph() { this((String) null); } /** -* Constructs a new job graph with the given name, a random job ID. +* Constructs a new job graph with the given name, a random job ID and the default +* {@link ExecutionConfig} parameters. * * @param jobName The name of the job */ public JobGraph(String jobName) { - this(null, jobName); + this(null, jobName, new ExecutionConfig()); --- End diff -- Again here the constructors that do not specify executionConfigs are only exists for tests. So here the empty ExecutionConfig is just to not break the already existing tests that were testing other parts of the functionality. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: Makes the task cancellation interv...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1612 FLINK-2523: Makes the task cancellation interval configurable. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink task_cancellation_interval Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1612.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1612 commit cd36d7ec883e69828bcb476d69aba465dca79b8d Author: Aljoscha Krettek <aljoscha.kret...@gmail.com> Date: 2016-02-03T10:07:38Z [hotfix] Fix typos in Trigger.java commit fdf74f036a96708fae7b8c8a5a2ce041bb7ed20f Author: Kostas Kloudas <kklou...@gmail.com> Date: 2016-02-03T12:58:12Z FLINK-3327: Attaches the ExecutionConfig to the JobGraph and propagates it to the Task itself. commit 5ead1c05215f4fd70f55370f7864674d670b1282 Author: Kostas Kloudas <kklou...@gmail.com> Date: 2016-02-09T14:48:00Z FLINK-2523: Makes the task cancellation interval configurable through the ExecutionConfig. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2213 Makes the number of vcores per YARN...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1588#issuecomment-180847040 Thanks a lot for the comments @rmetzger and @StephanEwen . --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2213 Makes the number of vcores per YARN...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/1588#discussion_r52106077 --- Diff: docs/setup/config.md --- @@ -211,6 +211,8 @@ The parameters define the behavior of tasks that create result files. yarn.application-master.env.LD_LIBRARY_PATH: "/usr/lib/native" +- `yarn.containers.vcores` The number of virtual cores (vcores) per YARN container. By default the number of `vcores` is set equal to the maximum between the number of slots per TaskManager, and the number of cores available to the Java runtime. --- End diff -- This was to have a fallback strategy in case the slots parameter is not set. But @StephanEwen 's comment probably solves it. The fallback will be set to the previous strategy where vcores=1. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: making the task cancellation inter...
Github user kl0u closed the pull request at: https://github.com/apache/flink/pull/1546 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2213 Makes the number of vcores per YARN...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1588 FLINK-2213 Makes the number of vcores per YARN container configurable. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink vcores_param Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1588.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1588 commit 91d2dc905e5a82b9812dbbe172c9a267eff27ad6 Author: Kostas Kloudas <kklou...@gmail.com> Date: 2016-02-04T14:01:58Z Makes the YARN_VCORES configurable. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: ExecutionConfig to JobGraph.
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1583 ExecutionConfig to JobGraph. This makes the ExecutionConfig available to the Task. In a nutshell, the ExecutionConfig is attached to the JobGraph which is sent to the JobManager. The JobManager passes it to the ExecutionGraph, and, later on, to the TaskDeploymentDescriptor, and the TaskManager puts it into the Environment, which is visible to the AbstractInvocable. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink config_to_jobgraph Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1583.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1583 commit 58308ddd7be38d9019c77738e4faebf05c92cc12 Author: Aljoscha Krettek <aljoscha.kret...@gmail.com> Date: 2016-02-03T10:07:38Z [hotfix] Fix typos in Trigger.java commit ce3645d3d965a063696af3c56ca9cb0b8be3b36c Author: Kostas Kloudas <kklou...@gmail.com> Date: 2016-02-03T12:58:12Z Attaches the ExecutionConfig to the JobGraph and propagates it to the Task itself. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2380: allow to specify the default files...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1524#issuecomment-178507931 Could you explain what more tests do you have in mind? So far I am testing 1) if the scheme provided in the configuration is used when one is not explicitly provided, 2) if an explicit scheme overrides the configuration one, and 3) if a scheme from the configuration overrides the default one. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-3254: Adding functionality to support th...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1568 FLINK-3254: Adding functionality to support the CombineFunction contract. Solves ISSUE-3254: now a function that implements the GroupReduceFunction and the CombineFunction interfaces will be executed with a combiner. Before, this was the case only if the function was implementing the GroupReduceFunction and the GroupCombineFunction interfaces. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink combiner_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1568.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1568 commit 49aabd959472a49f9803023bbec360fec824db75 Author: Kostas Kloudas <kklou...@gmail.com> Date: 2016-02-01T08:46:00Z FLINK-3254: Adding functionality to support the CombineFunction contract. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-3254: Adding functionality to support th...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/1568#discussion_r51427689 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/operators/GroupReduceOperator.java --- @@ -156,8 +162,8 @@ public boolean isCombinable() { public GroupReduceOperator<IN, OUT> setCombinable(boolean combinable) { // sanity check that the function is a subclass of the combine interface - if (combinable && !(function instanceof GroupCombineFunction)) { - throw new IllegalArgumentException("The function does not implement the combine interface."); + if (combinable && !(function instanceof GroupCombineFunction || function instanceof CombineFunction)) { --- End diff -- Hi @fhueske, could you explain this a bit more? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-3198: Renames and documents the getDataS...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1548#issuecomment-174917008 Thanks a lot @fhueske ! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: making the task cancellation inter...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/1546#discussion_r50694370 --- Diff: flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java --- @@ -60,6 +60,7 @@ public static void main(String[] args) throws Exception { // set up the execution environment final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); + env.getConfig().setTaskCancellationDelay(4); --- End diff -- Yes you are right. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-3198: Renames and documents the getDataS...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1548 FLINK-3198: Renames and documents the getDataSet() method in Grouping. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink groupBy_renaming Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1548.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1548 commit 130fc0c27c1582db4b9528e1ee7818de8d48bef3 Author: Kostas Kloudas <kklou...@gmail.com> Date: 2016-01-25T15:07:39Z FLINK-3198: Renames and documents better the use of the getDataSet() in Grouping. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: making the task cancellation inter...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1546 FLINK-2523: making the task cancellation interval configurable. FLINK-2523: Makes the task cancellation interval configurable. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink task_cancellation Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1546.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1546 commit bfbb45ddeb29b4eb3f366180d1edeecfb2bc06fd Author: Kostas Kloudas <kklou...@gmail.com> Date: 2016-01-20T13:05:58Z FLINK-2523: making the task cancellation interval configurable. FLINK-2523: making the task cancellation interval configurable. Added the cancellation delay, although no tests and not sure if it has to be checkpointed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2523: making the task cancellation inter...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1546#issuecomment-174548688 Hi @StephanEwen , Could you please elaborate more on how you think that the ExecutionConfig could be accessed differently by the Task? Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2380: allow to specify the default files...
Github user kl0u commented on a diff in the pull request: https://github.com/apache/flink/pull/1524#discussion_r50253291 --- Diff: docs/setup/config.md --- @@ -52,6 +52,14 @@ The configuration files for the TaskManagers can be different, Flink does not as - `parallelism.default`: The default parallelism to use for programs that have no parallelism specified. (DEFAULT: 1). For setups that have no concurrent jobs running, setting this value to NumTaskManagers * NumSlotsPerTaskManager will cause the system to use all available execution resources for the program's execution. **Note**: The default parallelism can be overwriten for an entire job by calling `setParallelism(int parallelism)` on the `ExecutionEnvironment` or by passing `-p ` to the Flink Command-line frontend. It can be overwritten for single transformations by calling `setParallelism(int parallelism)` on an operator. See the [programming guide]({{site.baseurl}}/apis/programming_guide.html#parallel-execution) for more information about the parallelism. +- `fs.default-scheme`: The default filesystem scheme to be used, with the necessary authority (if needed). +By default, this is set to `file:///` which points to the local filesystem. This means that the local --- End diff -- Actually it must be three. The authority in the case of the local filesystem is empty, this is denoted by having nothing between the two first and the third slashes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: FLINK-2380: allow to specify the default files...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1524 FLINK-2380: allow to specify the default filesystem scheme in the flink configuration file. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink fs_param Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1524.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1524 commit ef6431f586f983c2b0ba9318cc4046c3b348a742 Author: Kostas Kloudas <kklou...@gmail.com> Date: 2016-01-19T15:42:33Z FLINK-2380: allow the specification of a default filesystem scheme in the flink configuration file. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Create a deep-copy of the record when changing...
Github user kl0u closed the pull request at: https://github.com/apache/flink/pull/1231 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Create a deep-copy of the record when changing...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/1231 Create a deep-copy of the record when changing timestamps. This is too fix the problem of changing the timestamps in-place, versus creating a deep-copy and changing the timestamp in the new copy of each record in the stream. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink timestamp_deep_copy Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1231.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1231 commit ff766ae3b5bf2073e9ea531353d0bc8f8e3e9ddf Author: Kostas Kloudas <kklou...@gmail.com> Date: 2015-10-06T14:33:48Z Create a deep-copy of the record when changing timestamps. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-136782152 @mxm Sounds good. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-136775729 Thanks @mxm. Although I don't think I will have time to fix it right now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-136765583 Hi @mxm. What do you mean? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-136771159 When I did the latest rebase it was saying that the two branches were ready to be merged. Is there a way to see where they have diverged? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-135033380 Hello! I just rebased. Please have a look. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-131208217 Just rebased with the new version of the master. Please have a look. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-131209661 No problem! This message was just a reminder. Thanks a lot! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-126076384 Hi @mxm , Thanks a lot for the comments! I integrated most of them. Please have a look and let me know what you think. For the merging of the the different types of snapshots and handling them uniformly I do not have any current solution. If you have any, I am open, of course, to discuss it, because I agree that this would be nice. For the comment on the getAccumulatorResultsStringified(): 1) this is to be presented by the web interface to the user, just for monitoring purposes 2) this is called at the jobManager. The problem is that the jobManager has only the blobKeys that point to the stored accumulators. The serialized data reside in the blobCache and have to be fetched in order to be inspected. Currently the jobManager just forwards the blobKeys to the client, which fetches the blobs and does the deserialization and the final merging. This is done for jobManager scalability reasons, as given that we are talking about accumulators of arbitrary size, loading them from disk and deserializing them would be time and resource consuming. The same holds in the case that we wanted to get the type of these large accumulators (it is needed by the method). We would have to load and deserialize them at the jobManager. The currently implemented solution is just the result of this design decision. If you have any other strategy or solution that is worth implementing, let me know. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u closed the pull request at: https://github.com/apache/flink/pull/887 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/934 Framesize fix In Apache Flink the results of the collect() call were returned through akka to the client. This led to an inherent limitation to the size of the output of a job, as this could not exceed the akka.framesize size. In other case, akka would drop the message. To alleviate this, without dropping the benefits brought by akka and its out-of-the-box efficiency for small-sized results, we decided to keep forwarding the non-oversized (i.e. smaller than the akka.framesize) results through akka, and use the BlobCache module for the forwarding the oversized (large) ones. Now the JobManager receives end merges the small accumulators (as before), and simply forwards to the Client the keys to the blobs storing the oversized ones. Now it is the responsibility of the Client to do the final merging between oversized and non-oversized accumulators. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink framesize_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/934.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #934 commit fb1fbd6bdcc81acd20d422842789fce0c0872580 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-22T17:53:11Z Solved the #887 issue: removing the akka.framesize size limitation for the result of a job. commit 34d3e433eb0ce976539de166288550c9c7612eb4 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-22T17:53:11Z Solved the #887 issue: removing the akka.framesize size limitation for the result of a job. commit 55aa50c3f3e5c4c3a253b8da68b5ddde9acb307f Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-24T12:02:10Z Merge branch 'framesize_fix' of https://github.com/kl0u/flink into framesize_fix --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-124500486 FLINK-2319 This pull request targets this ticket. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-124500737 Hello guys, This is a new pull request, for a previous ticket. It is aligned with recent changes in the master branch. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-123752371 Hi @mxm. Thanks a lot! I don't have your email unfortunately. Could you somehow send it to me? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-123756855 Thanks a lot! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-123738197 Hello, The latest changes are pretty invasive and have a big overlap with the ones in my pull request. More specifically, the abstraction of the AccumulatorRegistry changes my implementation a lot. Consequently I have to re-implement much of my previous code. This may take some time. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-119140624 Ok, sounds good! Could you give the number of the ticket of the changes @mxm is doing? Just to have a look. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-118642591 Hello, I have integrated the changes you suggested, so now: 1) notes are no longer in the .gitignore 2) the collect example is not in the created jars, if fact it is no longer in the examples 3) the oversized accumulator test is added in the MiscellaneousIssuesITCase. And also a JIRA ticket is created. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/887 Collect(): Fixing the akka.framesize size limitation. In Apache Flink the results of the collect() call were returned through akka to the client. This led to an inherent limitation to the size of the output of a job, as this could not exceed the akka.framesize size. In other case, akka would drop the message. To alleviate this, without dropping the benefits brought by akka and its out-of-the-box efficiency for small-sized results, we decided to keep forwarding the non-oversized (i.e. smaller than the akka.framesize) results through akka, and use the BlobCache module for the forwarding the oversized (large) ones. Now the JobManager receives end merges the small accumulators (as before), and simply forwards to the Client the keys to the blobs storing the oversized ones. Now it is the responsibility of the Client to do the final merging between oversized and non-oversized accumulators. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink collect_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/887.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #887 commit f417e2585fda1aca936b8e0637618d44cd0b81ca Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-04T14:50:48Z A first working version of collect() with unbounded Accumulator sizes. commit bf52a091b0fbb04426fa61949334cc44c548d6c2 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-04T15:31:47Z Cleaned up the TaskManaget side. commit f0de184b0a3aac64bcaa753db0917778e031883e Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-04T18:54:08Z Cleaned up till the JobManager side. commit 10faf14c4df168da533a35fefa495c1b860ddf1d Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-04T22:56:09Z Cleaned up the code. Missing the Stringified result. commit 9cd35f46dcf5e6494196185621413ba793da0913 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T00:37:12Z Fixed a version for the Stringified result. commit e5787c74e48a9bed7c503e5d2e90c51b5f33d24f Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T01:45:38Z Fixed a sanity check in the SerializedJobExecutionResult. commit c36bab2c54f1e6a9f401be6eb1e9a75171342212 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T02:35:17Z Fixed the cleaning up of the BlobCache after the end of the job. commit 764f8bda9fbda58d3df7cac51f5b1b2c1cee10de Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T03:41:44Z Fixed a test bug. commit 1c1701a0bd8e4eef742d18494875176136f35233 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T14:52:06Z Fixed a comment in the RuntimeEnvironment. commit 1471bc22bd32675be91c96ec5e0e8ce884fc0bd0 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T15:01:59Z Fixed some method and class renaming. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---