[jira] [Commented] (FLINK-7) [GitHub] Enable Range Partitioner
[ https://issues.apache.org/jira/browse/FLINK-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975823#comment-14975823 ]

ASF GitHub Bot commented on FLINK-7:

Github user ChengXiangLi commented on a diff in the pull request: https://github.com/apache/flink/pull/1255#discussion_r43088529

--- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java ---
@@ -1223,6 +1230,51 @@ public long count() throws Exception {
 	final TypeInformation<K> keyType = TypeExtractor.getKeySelectorTypes(keyExtractor, getType());
 	return new PartitionOperator<>(this, PartitionMethod.HASH, new Keys.SelectorFunctionKeys<>(clean(keyExtractor), this.getType(), keyType), Utils.getCallLocationName());
 }
+
+/**
+ * Range-partitions a DataSet using the specified KeySelector.
+ *
+ * Important: This operation shuffles the whole DataSet over the network and can take a significant amount of time.
+ *
+ * @param keySelector The KeySelector with which the DataSet is range-partitioned.
+ * @return The partitioned DataSet.
+ *
+ * @see KeySelector
+ */
+public <K extends Comparable<K>> DataSet<T> partitionByRange(KeySelector<T, K> keySelector) {
+	final TypeInformation<K> keyType = TypeExtractor.getKeySelectorTypes(keySelector, getType());
+	String callLocation = Utils.getCallLocationName();
+
+	// Extract key from input element by keySelector.
+	KeyExtractorMapper<T, K> keyExtractorMapper = new KeyExtractorMapper<>(keySelector);
--- End diff --

Yes, it's a very low-level job abstraction. I'm not sure whether I can get everything required, and I didn't find any precedent for this, but it deserves a try. Besides, everything required (ship strategy type / target parallelism) is available at the `OptimizedPlan` level, so I think it would be better to inject the sampling and partition-ID assignment code by modifying the `OptimizedPlan` at the beginning of `JobGraphGenerator::compileJobGraph` instead of the previous injection point the next comment mentions. The previous injection point is in the middle of building the `JobGraph` and requires rewriting the `JobGraph`, an even lower level than `OptimizedPlan`.

> [GitHub] Enable Range Partitioner
> -
>
> Key: FLINK-7
> URL: https://issues.apache.org/jira/browse/FLINK-7
> Project: Flink
> Issue Type: Sub-task
> Components: Distributed Runtime
> Reporter: GitHub Import
> Assignee: Chengxiang Li
> Fix For: pre-apache
>
> The range partitioner is currently disabled. We need to implement the following aspects:
> 1) Distribution information, if available, must be propagated back together with the ordering property.
> 2) A generic bucket lookup structure (currently specific to PactRecord).
> Tests to re-enable after fixing this issue:
> - TeraSortITCase
> - GlobalSortingITCase
> - GlobalSortingMixedOrderITCase
> Imported from GitHub
> Url: https://github.com/stratosphere/stratosphere/issues/7
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: core, enhancement, optimizer,
> Milestone: Release 0.4
> Assignee: [fhueske|https://github.com/fhueske]
> Created at: Fri Apr 26 13:48:24 CEST 2013
> State: open

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
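The sampling-based scheme under discussion — sample the keys, derive range boundaries, then assign each record a partition ID — can be sketched independently of Flink's runtime. The class and method names below are illustrative, not Flink API:

```java
import java.util.*;

// Illustrative sketch of sampling-based range partitioning (not Flink code):
// boundaries are drawn from a sorted sample of the keys, and each record is
// assigned a partition ID by binary search over those boundaries.
public class RangePartitionSketch {

    // Pick (numPartitions - 1) boundary keys from the sorted sample.
    static int[] boundaries(List<Integer> sample, int numPartitions) {
        List<Integer> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        int[] bounds = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            bounds[i - 1] = sorted.get(i * sorted.size() / numPartitions);
        }
        return bounds;
    }

    // Assign a key to the first range whose upper boundary is >= key.
    static int partitionOf(int key, int[] bounds) {
        int pos = Arrays.binarySearch(bounds, key);
        return pos >= 0 ? pos : -(pos + 1);
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(5, 1, 9, 3, 7, 2, 8, 4, 6, 0);
        int[] bounds = boundaries(keys, 2); // one boundary, two ranges
        Map<Integer, List<Integer>> parts = new TreeMap<>();
        for (int k : keys) {
            parts.computeIfAbsent(partitionOf(k, bounds), p -> new ArrayList<>()).add(k);
        }
        System.out.println(parts);
    }
}
```

In the real implementation the sample would be collected at runtime and the boundary table shipped to all senders; the sketch only shows the boundary/assignment logic.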
[jira] [Updated] (FLINK-2922) Add Queryable Window Operator
[ https://issues.apache.org/jira/browse/FLINK-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-2922: - Labels: requires-design-doc (was: ) > Add Queryable Window Operator > - > > Key: FLINK-2922 > URL: https://issues.apache.org/jira/browse/FLINK-2922 > Project: Flink > Issue Type: Improvement > Components: Streaming > Reporter: Aljoscha Krettek > Assignee: Aljoscha Krettek > Labels: requires-design-doc > > The idea is to provide a window operator that allows querying the current window result at any time without discarding it. > For example, a user might have an aggregation window operation with tumbling windows of 1 hour. At any time, they might be interested in the current aggregated value for the in-flight hour window. > The idea is to make the operator a two-input operator where normal elements arrive on input one while queries arrive on input two. The query stream must be keyed by the same key as the input stream. If a query arrives for a key, the current value for that key is emitted along with the query element so that the user can map the result to the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
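The proposed two-input behavior can be sketched in plain Java (illustrative only, not the Flink operator API): input one updates a running per-key aggregate, input two reads the in-flight value for a key without discarding it, echoing the query so results can be matched back.

```java
import java.util.*;

// Sketch of the proposed queryable window operator (not Flink code):
// elements update a per-key running aggregate; queries read the current
// aggregate for their key and carry their query id along with the result.
public class QueryableWindowSketch {
    private final Map<String, Long> currentWindow = new HashMap<>();

    // Input one: a normal element updates the in-flight aggregate for its key.
    public void processElement(String key, long value) {
        currentWindow.merge(key, value, Long::sum);
    }

    // Input two: a query emits the current aggregate for its key alongside
    // the query id, so the caller can map the result to the query.
    public String processQuery(String key, String queryId) {
        return queryId + " -> " + currentWindow.getOrDefault(key, 0L);
    }
}
```

A query for key "a" after elements (a,2) and (a,3) would see the in-flight sum 5, while the window itself keeps accumulating.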
[jira] [Closed] (FLINK-2476) Remove unwanted check null of input1 in ConnectedDataStream
[ https://issues.apache.org/jira/browse/FLINK-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-2476. --- Resolution: Fixed > Remove unwanted check null of input1 in ConnectedDataStream > --- > > Key: FLINK-2476 > URL: https://issues.apache.org/jira/browse/FLINK-2476 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 0.8.1 >Reporter: fangfengbin >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2918) Add a method to be able to read SequenceFileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975849#comment-14975849 ] ASF GitHub Bot commented on FLINK-2918: --- Github user smarthi commented on the pull request: https://github.com/apache/flink/pull/1299#issuecomment-151398008 @tillrohrmann There's an existing DataSet#write() api, don't think we need a DataSet#writeAsSequenceFile(). > Add a method to be able to read SequenceFileInputFormat > --- > > Key: FLINK-2918 > URL: https://issues.apache.org/jira/browse/FLINK-2918 > Project: Flink > Issue Type: Improvement > Components: Core >Affects Versions: 0.9.1 >Reporter: Suneel Marthi >Assignee: Suneel Marthi >Priority: Minor > Fix For: 0.10 > > > This is to add a method to ExecutionEnvironment.{java,scala} to be able to > provide syntactic sugar to read a SequenceFileInputFormat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-2883) Combinable reduce produces wrong result
[ https://issues.apache.org/jira/browse/FLINK-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-2883: - Issue Type: Task (was: Bug) > Combinable reduce produces wrong result > --- > > Key: FLINK-2883 > URL: https://issues.apache.org/jira/browse/FLINK-2883 > Project: Flink > Issue Type: Task > Affects Versions: 0.10 > Reporter: Till Rohrmann > > If one uses a combinable reduce operation that also changes the key value of the underlying data element, the results of the reduce operation can become wrong. The reason is that after the combine phase, another reduce operator is executed which then reduces the elements based on the new key values. This might not be so surprising if one explicitly defined one's {{GroupReduceOperation}} as combinable. However, the {{ReduceFunction}} conceals the fact that a combiner is used implicitly. Furthermore, the API does not prevent the user from changing the key fields, which could have prevented the problem. > The following example program illustrates the problem: > {code} > val env = ExecutionEnvironment.getExecutionEnvironment > env.setParallelism(1) > val input = env.fromElements((1,2), (1,3), (2,3), (3,3), (3,4)) > val result = input.groupBy(0).reduce{ > (left, right) => > (left._1 + right._1, left._2 + right._2) > } > result.output(new PrintingOutputFormat[Int]()) > env.execute() > {code} > The expected output is > {code} > (2, 5) > (2, 3) > (6, 7) > {code} > However, the actual output is > {code} > (4, 8) > (6, 7) > {code} > I think that the underlying problem is that associativity and commutativity are not sufficient for a combinable reduce operation. Additionally, we also need to make sure that the key stays the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
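The failure mode can be reproduced without Flink: applying the same key-changing reduce function once per group (the combine phase) and then again on the re-keyed partials (the final reduce) merges groups that were originally distinct. A minimal plain-Java simulation of the sum-of-pairs example above:

```java
import java.util.*;

// Simulates why a combinable reduce that changes the key produces wrong
// results: the combine phase emits partial sums whose *new* first field is
// then used as the grouping key again in the final reduce phase.
public class CombineKeyChangeDemo {

    // Group pairs by their first field and sum both fields per group --
    // this is the key-changing reduce function from the example.
    static List<int[]> reduceByFirstField(List<int[]> input) {
        Map<Integer, int[]> groups = new LinkedHashMap<>();
        for (int[] p : input) {
            groups.merge(p[0], p.clone(),
                (l, r) -> new int[]{l[0] + r[0], l[1] + r[1]});
        }
        return new ArrayList<>(groups.values());
    }

    public static void main(String[] args) {
        List<int[]> input = Arrays.asList(
            new int[]{1, 2}, new int[]{1, 3}, new int[]{2, 3},
            new int[]{3, 3}, new int[]{3, 4});

        // Expected semantics: reduce once -> (2,5), (2,3), (6,7)
        List<int[]> expected = reduceByFirstField(input);

        // Combiner plus final reduce: reduce twice. The combine output (2,5)
        // and the untouched group (2,3) now share key 2 and get merged into
        // (4,8) -- the wrong result from the report.
        List<int[]> actual = reduceByFirstField(reduceByFirstField(input));

        for (int[] p : expected) System.out.println(Arrays.toString(p));
        System.out.println("--");
        for (int[] p : actual) System.out.println(Arrays.toString(p));
    }
}
```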
[jira] [Commented] (FLINK-2797) CLI: Missing option to submit jobs in detached mode
[ https://issues.apache.org/jira/browse/FLINK-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975886#comment-14975886 ] ASF GitHub Bot commented on FLINK-2797: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1214#issuecomment-151402904 @mxm I have made some changes and now the *current* plan resides in the `ContextEnvironment`. This required some changes in `StreamContextEnvironment`, which now wraps the `ContextEnvironment` which was initially created to check the existence of context environment. Furthermore, `ContextEnvironmentFactory` needs to store the last context environment created, which can be used then to access the plan, so it can be executed after the call to interactive program's `main` finishes. > CLI: Missing option to submit jobs in detached mode > --- > > Key: FLINK-2797 > URL: https://issues.apache.org/jira/browse/FLINK-2797 > Project: Flink > Issue Type: Bug > Components: Command-line client >Affects Versions: 0.9, 0.10 >Reporter: Maximilian Michels >Assignee: Sachin Goel > Fix For: 0.10 > > > Jobs can only be submitted in detached mode using YARN but not on a > standalone installation. This has been requested by users who want to submit > a job, get the job id, and later query its status. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
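The mechanism described — the factory remembers the last environment it created so the client can retrieve the plan after the user's `main` returns — can be sketched as follows. The class names are borrowed from the discussion, but the fields and methods are illustrative, not actual Flink code:

```java
import java.util.*;

// Illustrative sketch: a context-environment factory that records the last
// environment it handed out, so the submitting client can inspect the plan
// built inside the user's main() after it returns.
public class ContextEnvironmentFactorySketch {

    static class ContextEnvironment {
        private final List<String> plan = new ArrayList<>();
        void addOperation(String op) { plan.add(op); } // stands in for plan building
        List<String> getPlan() { return plan; }
    }

    private ContextEnvironment lastEnvironment;

    // Called whenever user code asks for an execution environment.
    ContextEnvironment createEnvironment() {
        lastEnvironment = new ContextEnvironment();
        return lastEnvironment;
    }

    // Called by the client after invoking the user's main(); null means the
    // program never created an environment (the error case debated below).
    ContextEnvironment getLastEnvironment() {
        return lastEnvironment;
    }
}
```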
[jira] [Updated] (FLINK-2883) Add documentation to forbid key-modifying ReduceFunction
[ https://issues.apache.org/jira/browse/FLINK-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-2883: - Summary: Add documentation to forbid key-modifying ReduceFunction (was: Combinable reduce produces wrong result) > Add documentation to forbid key-modifying ReduceFunction > > > Key: FLINK-2883 > URL: https://issues.apache.org/jira/browse/FLINK-2883 > Project: Flink > Issue Type: Task > Affects Versions: 0.10 > Reporter: Till Rohrmann -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2920) Apply JMH on KryoVersusAvroMinibenchmark class.
[ https://issues.apache.org/jira/browse/FLINK-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975997#comment-14975997 ] ASF GitHub Bot commented on FLINK-2920: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1302#issuecomment-151412816 Looks good to merge. Thanks! > Apply JMH on KryoVersusAvroMinibenchmark class. > --- > > Key: FLINK-2920 > URL: https://issues.apache.org/jira/browse/FLINK-2920 > Project: Flink > Issue Type: Sub-task > Components: Tests > Reporter: GaoLun > Assignee: GaoLun > Priority: Minor > Labels: easyfix > > JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks. Use JMH to replace the old micro-benchmark method in order to get much more accurate results. Modify the KryoVersusAvroMinibenchmark class and move it to the flink-benchmark module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2797) CLI: Missing option to submit jobs in detached mode
[ https://issues.apache.org/jira/browse/FLINK-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976080#comment-14976080 ]

ASF GitHub Bot commented on FLINK-2797:

Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1214#discussion_r43098424

--- Diff: flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -271,18 +272,21 @@ public JobSubmissionResult runDetached(PackagedProgram prog, int parallelism)
 } else if (prog.isUsingInteractiveMode()) {
 	LOG.info("Starting program in interactive mode");
-	ContextEnvironment.setAsContext(this, prog.getAllLibraries(), prog.getClasspaths(),
-		prog.getUserCodeClassLoader(), parallelism, false);
-
+	ContextEnvironment.ContextEnvironmentFactory factory = ContextEnvironment.setAsContext(this,
+		prog.getAllLibraries(), prog.getClasspaths(), prog.getUserCodeClassLoader(), parallelism, false);
 	// invoke here
 	try {
 		prog.invokeInteractiveModeForExecution();
+		ContextEnvironment ctx = factory.getLastEnvironment();
+		if (ctx == null) {
+			throw new InvalidProgramException("No execution environment was created.");
--- End diff --

It's not strictly necessary. This should return a `JobSubmissionResult` with a null job id then.

> CLI: Missing option to submit jobs in detached mode
> ---
>
> Key: FLINK-2797
> URL: https://issues.apache.org/jira/browse/FLINK-2797

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2797) CLI: Missing option to submit jobs in detached mode
[ https://issues.apache.org/jira/browse/FLINK-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976079#comment-14976079 ]

ASF GitHub Bot commented on FLINK-2797:

Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1214#discussion_r43098367

--- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamContextEnvironment.java ---
@@ -75,29 +62,31 @@ public JobExecutionResult execute() throws Exception {
 @Override
 public JobExecutionResult execute(String jobName) throws Exception {
-	JobGraph jobGraph;
-	if (jobName == null) {
-		jobGraph = this.getStreamGraph().getJobGraph();
-	} else {
-		jobGraph = this.getStreamGraph().getJobGraph(jobName);
-	}
-
-	transformations.clear();
-
-	// attach all necessary jar files to the JobGraph
-	for (URL file : jars) {
--- End diff --

The detached job goes through `Client#getJobGraph(FlinkPlan, List, List)` where the jars and classpaths are added to the job graph.

> CLI: Missing option to submit jobs in detached mode
> ---
>
> Key: FLINK-2797
> URL: https://issues.apache.org/jira/browse/FLINK-2797

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2909) Gelly Graph Generators
[ https://issues.apache.org/jira/browse/FLINK-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976089#comment-14976089 ] Vasia Kalavri commented on FLINK-2909: -- Thank you [~greghogan]. Could you then edit the title/description of the issue to reflect the scope? Alternatively, we can keep this as a general umbrella issue and create subtasks for the specific generators and other utilities we'll need. > Gelly Graph Generators > -- > > Key: FLINK-2909 > URL: https://issues.apache.org/jira/browse/FLINK-2909 > Project: Flink > Issue Type: New Feature > Components: Gelly > Affects Versions: 1.0 > Reporter: Greg Hogan > Assignee: Greg Hogan > > Include a selection of graph generators in Gelly. Generated graphs will be useful for performing scalability, stress, and regression testing as well as for benchmarking and comparing algorithms, for both Flink users and developers. Generated data is infinitely scalable yet described by a few simple parameters, and can often substitute for user data or for sharing large files when reporting issues. > There are multiple categories of graphs, as documented by [NetworkX|https://networkx.github.io/documentation/latest/reference/generators.html] and elsewhere. > Graphs may be well-defined, e.g. the [Chvátal graph|https://en.wikipedia.org/wiki/Chv%C3%A1tal_graph]. These may be sufficiently small to populate locally. > Graphs may be scalable, e.g. complete and star graphs. These should use Flink's distributed parallelism. > Graphs may be stochastic, e.g. [RMat graphs|http://snap.stanford.edu/class/cs224w-readings/chakrabarti04rmat.pdf]. A key consideration is that the generators should source randomness from a seedable PRNG and generate the same Graph regardless of parallelism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
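The last requirement — the same seed must yield the same graph regardless of parallelism — is commonly met by deriving each edge's randomness from the seed and the edge index rather than from one shared PRNG stream. A hypothetical sketch, not Gelly code:

```java
import java.util.*;

// Sketch: deterministic random edge generation that is independent of how
// the index range is split across parallel workers. Each edge i draws from
// its own PRNG seeded by (seed, i), so any partitioning of the indices
// 0..n-1 over workers reproduces the identical edge list.
public class SeededEdgeGenerator {

    // Generate edge i of a random graph over `vertices` vertices.
    static long[] edge(long seed, long i, int vertices) {
        Random rnd = new Random(seed ^ (i * 0x9E3779B97F4A7C15L)); // per-edge seed
        return new long[]{rnd.nextInt(vertices), rnd.nextInt(vertices)};
    }

    // A "worker" generates the sub-range [from, to) of edge indices.
    static List<long[]> generate(long seed, long from, long to, int vertices) {
        List<long[]> edges = new ArrayList<>();
        for (long i = from; i < to; i++) {
            edges.add(edge(seed, i, vertices));
        }
        return edges;
    }

    public static void main(String[] args) {
        // One worker generating all 10 edges...
        List<long[]> serial = generate(42, 0, 10, 100);
        // ...matches two workers generating 5 edges each.
        List<long[]> parallel = new ArrayList<>(generate(42, 0, 5, 100));
        parallel.addAll(generate(42, 5, 10, 100));
        for (int i = 0; i < serial.size(); i++) {
            System.out.println(Arrays.equals(serial.get(i), parallel.get(i)));
        }
    }
}
```

The same idea extends to stochastic generators such as RMat: the per-edge seed makes the output a pure function of (seed, index), so scaling the parallelism never changes the generated graph.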
[jira] [Commented] (FLINK-2797) CLI: Missing option to submit jobs in detached mode
[ https://issues.apache.org/jira/browse/FLINK-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976087#comment-14976087 ] ASF GitHub Bot commented on FLINK-2797: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/1214#issuecomment-151430438 Thanks. Looks much better. Let's make sure we don't break any classloader or jar dependencies during job submission. This can cause annoyances for users and we have to fix it afterwards :) > CLI: Missing option to submit jobs in detached mode > --- > > Key: FLINK-2797 > URL: https://issues.apache.org/jira/browse/FLINK-2797 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-2922) Add Queryable Window Operator
[ https://issues.apache.org/jira/browse/FLINK-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-2922: Attachment: FLINK-2922.pdf Design Doc > Add Queryable Window Operator > - > > Key: FLINK-2922 > URL: https://issues.apache.org/jira/browse/FLINK-2922 > Attachments: FLINK-2922.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLINK-2922) Add Queryable Window Operator
[ https://issues.apache.org/jira/browse/FLINK-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976143#comment-14976143 ] Aljoscha Krettek edited comment on FLINK-2922 at 10/27/15 10:13 AM: I attached a design doc. was (Author: aljoscha): Design Doc > Add Queryable Window Operator > - > > Key: FLINK-2922 > URL: https://issues.apache.org/jira/browse/FLINK-2922 > Attachments: FLINK-2922.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2797][cli] Add support for running jobs...
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/1214#discussion_r43098268

--- Diff: flink-clients/src/main/java/org/apache/flink/client/CliFrontend.java ---
@@ -909,7 +919,21 @@ private int handleError(Throwable t) {
 }
 LOG.error("Error while running the command.", t);
-	t.printStackTrace();
+	// check if the error was due to an invalid program in detached mode.
+	if (t instanceof ProgramInvocationException && t.getCause() instanceof DetachedProgramException) {
+		System.err.println(t.getCause().getMessage());
+		// now trace to the user's main method. We don't wanna show unnecessary information
+		// in this particular case.
--- End diff --

Why in this case? Doesn't that apply to all errors that occur during interactive execution?

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2797][cli] Add support for running jobs...
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/1214#discussion_r43098234

--- Diff: flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
@@ -271,18 +272,21 @@ public JobSubmissionResult runDetached(PackagedProgram prog, int parallelism)
 } else if (prog.isUsingInteractiveMode()) {
 	LOG.info("Starting program in interactive mode");
-	ContextEnvironment.setAsContext(this, prog.getAllLibraries(), prog.getClasspaths(),
-		prog.getUserCodeClassLoader(), parallelism, false);
-
+	ContextEnvironment.ContextEnvironmentFactory factory = ContextEnvironment.setAsContext(this,
+		prog.getAllLibraries(), prog.getClasspaths(), prog.getUserCodeClassLoader(), parallelism, false);
 	// invoke here
 	try {
 		prog.invokeInteractiveModeForExecution();
+		ContextEnvironment ctx = factory.getLastEnvironment();
+		if (ctx == null) {
+			throw new InvalidProgramException("No execution environment was created.");
--- End diff --

Do we really want to fail here?
[jira] [Commented] (FLINK-2017) Add predefined required parameters to ParameterTool
[ https://issues.apache.org/jira/browse/FLINK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976122#comment-14976122 ] ASF GitHub Bot commented on FLINK-2017: --- Github user mliesenberg commented on the pull request: https://github.com/apache/flink/pull/1097#issuecomment-151439931 I do see your point about the uselessness of only `check`, so I'll go ahead and implement `applyTo`. I'll update the PR tonight. > Add predefined required parameters to ParameterTool > --- > > Key: FLINK-2017 > URL: https://issues.apache.org/jira/browse/FLINK-2017 > Project: Flink > Issue Type: Improvement >Affects Versions: 0.9 >Reporter: Robert Metzger > Labels: starter > > In FLINK-1525 we've added the {{ParameterTool}}. > During the PR review, there was a request for required parameters. > This issue is about implementing a facility to define required parameters. > The tool should also be able to print a help menu with a list of all > parameters. > This test case shows my initial ideas how to design the API > {code} > @Test > public void requiredParameters() { > RequiredParameters required = new RequiredParameters(); > Option input = required.add("input").alt("i").help("Path to > input file or directory"); // parameter with long and short variant > required.add("output"); // parameter only with long variant > Option parallelism = > required.add("parallelism").alt("p").type(Integer.class); // parameter with > type > Option spOption = > required.add("sourceParallelism").alt("sp").defaultValue(12).help("Number > specifying the number of parallel data source instances"); // parameter with > default value, specifying the type. 
> Option executionType = > required.add("executionType").alt("et").defaultValue("pipelined").choices("pipelined", > "batch"); > ParameterUtil parameter = ParameterUtil.fromArgs(new > String[]{"-i", "someinput", "--output", "someout", "-p", "15"}); > required.check(parameter); > required.printHelp(); > required.checkAndPopulate(parameter); > String inputString = input.get(); > int par = parallelism.getInteger(); > String output = parameter.get("output"); > int sourcePar = parameter.getInteger(spOption.getName()); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
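The `check`/`applyTo` discussion above can be sketched as follows. This is a hypothetical minimal illustration of the proposed semantics — verify every registered option is present and fill in defaults — and uses illustrative class and method names, not Flink's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the RequiredParameters idea discussed above:
// register required options (optionally with defaults), then resolve a
// parsed parameter map against them. Names are illustrative only.
public class RequiredParametersSketch {

    // option name -> default value (null means "required, no default")
    private final Map<String, String> options = new HashMap<>();

    public void add(String name) {
        options.put(name, null);
    }

    public void add(String name, String defaultValue) {
        options.put(name, defaultValue);
    }

    // "applyTo" semantics: fail on missing required options, otherwise
    // return a copy of the parsed parameters with defaults filled in
    public Map<String, String> applyTo(Map<String, String> parsed) {
        Map<String, String> resolved = new HashMap<>(parsed);
        for (Map.Entry<String, String> option : options.entrySet()) {
            if (!resolved.containsKey(option.getKey())) {
                if (option.getValue() == null) {
                    throw new IllegalArgumentException(
                        "Missing required parameter: " + option.getKey());
                }
                resolved.put(option.getKey(), option.getValue());
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        RequiredParametersSketch required = new RequiredParametersSketch();
        required.add("input");
        required.add("parallelism", "1");

        Map<String, String> parsed = new HashMap<>();
        parsed.put("input", "someinput");

        // "parallelism" is absent in the input, so its default is applied
        System.out.println(required.applyTo(parsed).get("parallelism"));
    }
}
```

Combining check and populate in one call avoids the "useless `check`" problem mentioned in the comment: validation and default application cannot get out of sync.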
[jira] [Commented] (FLINK-2017) Add predefined required parameters to ParameterTool
[ https://issues.apache.org/jira/browse/FLINK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976130#comment-14976130 ] ASF GitHub Bot commented on FLINK-2017: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1097#discussion_r43101847 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/utils/RequiredParameter.java --- @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.java.utils; + +import java.util.HashMap; +import java.util.Map; +import java.util.Objects; + +/** + * Facility to manage required parameters in user defined functions. + */ +public class RequiredParameter { + + private static final String HELP_TEXT_PARAM_DELIMITER = "\t"; + private static final String HELP_TEXT_LINE_DELIMITER = "\n"; + + private HashMapdata; + + public RequiredParameter() { + this.data = new HashMap<>(); + } + + public void add(Option option) throws RequiredParameterException { --- End diff -- Sure, it's always good to have my eyes looking over code. Esp. for API issues. 
> Add predefined required parameters to ParameterTool > --- > > Key: FLINK-2017 > URL: https://issues.apache.org/jira/browse/FLINK-2017 > Project: Flink > Issue Type: Improvement >Affects Versions: 0.9 >Reporter: Robert Metzger > Labels: starter > > In FLINK-1525 we've added the {{ParameterTool}}. > During the PR review, there was a request for required parameters. > This issue is about implementing a facility to define required parameters. > The tool should also be able to print a help menu with a list of all > parameters. > This test case shows my initial ideas how to design the API > {code} > @Test > public void requiredParameters() { > RequiredParameters required = new RequiredParameters(); > Option input = required.add("input").alt("i").help("Path to > input file or directory"); // parameter with long and short variant > required.add("output"); // parameter only with long variant > Option parallelism = > required.add("parallelism").alt("p").type(Integer.class); // parameter with > type > Option spOption = > required.add("sourceParallelism").alt("sp").defaultValue(12).help("Number > specifying the number of parallel data source instances"); // parameter with > default value, specifying the type. > Option executionType = > required.add("executionType").alt("et").defaultValue("pipelined").choices("pipelined", > "batch"); > ParameterUtil parameter = ParameterUtil.fromArgs(new > String[]{"-i", "someinput", "--output", "someout", "-p", "15"}); > required.check(parameter); > required.printHelp(); > required.checkAndPopulate(parameter); > String inputString = input.get(); > int par = parallelism.getInteger(); > String output = parameter.get("output"); > int sourcePar = parameter.getInteger(spOption.getName()); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2918) Add a method to be able to read SequenceFileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976149#comment-14976149 ] ASF GitHub Bot commented on FLINK-2918: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1299#issuecomment-151442975 Hi @smarthi, I am not super familiar with Hadoop interface, but is the `hadoop2` switch really necessary? Hadoop2 does also support the `mapred.*` interfaces, right? Or are both `SequenceInputFormat`s and `SequenceOutputFormat`s not compatible? Thanks, Fabian > Add a method to be able to read SequenceFileInputFormat > --- > > Key: FLINK-2918 > URL: https://issues.apache.org/jira/browse/FLINK-2918 > Project: Flink > Issue Type: Improvement > Components: Core >Affects Versions: 0.9.1 >Reporter: Suneel Marthi >Assignee: Suneel Marthi >Priority: Minor > Fix For: 0.10 > > > This is to add a method to ExecutionEnvironment.{java,scala} to be able to > provide syntactic sugar to read a SequenceFileInputFormat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2662) CompilerException: "Bug: Plan generation for Unions picked a ship strategy between binary plan operators."
[ https://issues.apache.org/jira/browse/FLINK-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976217#comment-14976217 ] Chesnay Schepler commented on FLINK-2662: - I tried reproducing the error but the shipping strategy is picked correctly on my machine. > CompilerException: "Bug: Plan generation for Unions picked a ship strategy > between binary plan operators." > -- > > Key: FLINK-2662 > URL: https://issues.apache.org/jira/browse/FLINK-2662 > Project: Flink > Issue Type: Bug > Components: Optimizer >Affects Versions: 0.9.1, 0.10 >Reporter: Gabor Gevay > Fix For: 0.10 > > > I have a Flink program which throws the exception in the jira title. Full > text: > Exception in thread "main" org.apache.flink.optimizer.CompilerException: Bug: > Plan generation for Unions picked a ship strategy between binary plan > operators. > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.collect(BinaryUnionReplacer.java:113) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.postVisit(BinaryUnionReplacer.java:72) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.postVisit(BinaryUnionReplacer.java:41) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:170) > at > org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:163) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:163) > at > org.apache.flink.optimizer.plan.WorksetIterationPlanNode.acceptForStepFunction(WorksetIterationPlanNode.java:194) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.preVisit(BinaryUnionReplacer.java:49) > at > org.apache.flink.optimizer.traversals.BinaryUnionReplacer.preVisit(BinaryUnionReplacer.java:41) > at > org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:162) > at > 
org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199) > at > org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199) > at > org.apache.flink.optimizer.plan.OptimizedPlan.accept(OptimizedPlan.java:127) > at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:520) > at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:402) > at > org.apache.flink.client.LocalExecutor.getOptimizerPlanAsJSON(LocalExecutor.java:202) > at > org.apache.flink.api.java.LocalEnvironment.getExecutionPlan(LocalEnvironment.java:63) > at malom.Solver.main(Solver.java:66) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140) > The execution plan: > http://compalg.inf.elte.hu/~ggevay/flink/plan_3_4_0_0_without_verif.txt > (I obtained this by commenting out the line that throws the exception) > The code is here: > https://github.com/ggevay/flink/tree/plan-generation-bug > The class to run is "Solver". It needs a command line argument, which is a > directory where it would write output. (On first run, it generates some > lookuptables for a few minutes, which are then placed to /tmp/movegen) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2797) CLI: Missing option to submit jobs in detached mode
[ https://issues.apache.org/jira/browse/FLINK-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976071#comment-14976071 ] ASF GitHub Bot commented on FLINK-2797: --- Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/1214#discussion_r43098099 --- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamContextEnvironment.java --- @@ -75,29 +62,31 @@ public JobExecutionResult execute() throws Exception { @Override public JobExecutionResult execute(String jobName) throws Exception { - JobGraph jobGraph; - if (jobName == null) { - jobGraph = this.getStreamGraph().getJobGraph(); - } else { - jobGraph = this.getStreamGraph().getJobGraph(jobName); - } - - transformations.clear(); - - // attach all necessary jar files to the JobGraph - for (URL file : jars) { --- End diff -- No Jars for detached jobs? > CLI: Missing option to submit jobs in detached mode > --- > > Key: FLINK-2797 > URL: https://issues.apache.org/jira/browse/FLINK-2797 > Project: Flink > Issue Type: Bug > Components: Command-line client >Affects Versions: 0.9, 0.10 >Reporter: Maximilian Michels >Assignee: Sachin Goel > Fix For: 0.10 > > > Jobs can only be submitted in detached mode using YARN but not on a > standalone installation. This has been requested by users who want to submit > a job, get the job id, and later query its status. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2797][cli] Add support for running jobs...
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/1214#discussion_r43098098 --- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamContextEnvironment.java --- @@ -75,29 +62,31 @@ public JobExecutionResult execute() throws Exception { @Override public JobExecutionResult execute(String jobName) throws Exception { - JobGraph jobGraph; - if (jobName == null) { - jobGraph = this.getStreamGraph().getJobGraph(); - } else { - jobGraph = this.getStreamGraph().getJobGraph(jobName); - } - - transformations.clear(); - - // attach all necessary jar files to the JobGraph - for (URL file : jars) { - jobGraph.addJar(new Path(file.toURI())); - } - - jobGraph.setClasspaths(classpaths); --- End diff -- No classpaths for detached jobs? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2797][cli] Add support for running jobs...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1214#issuecomment-151435402 Agreed. I'll build and test with the example programs in standalone mode and on yarn. It should work perfectly though since `YarnSessionFIFOITCase` checks both batch and streaming jobs on detached clusters, and the standalone mode is tested in `ClientTest#testDetached`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2907) Bloom filter for Join
[ https://issues.apache.org/jira/browse/FLINK-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976126#comment-14976126 ] Fabian Hueske commented on FLINK-2907: -- Hi Greg, this is a really interesting idea. Being able to filter out data before the shuffle can save a lot of resources and time. I guess you saw the bloom filter implementation in the current hash join. This is of course only a local bloom filter that helps to reduce the number of probe side records that are spilled to disk if the hash table does not fit into the available memory. I see mainly two questions for this issue: # How can the proposed bloom filter be integrated? I see two ways to implement this feature: ## Light-weight integration at user API level. This would mean the bloom filter creation, merging, and filtering are done in user code. The drawbacks are that managed memory is not available at this level and we will face serialization overhead because some operations cannot be chained. However, this solution should be rather easy to implement. ## New runtime operators for bloom filters. This would mean implementing three additional drivers to operate on bloom filters: build local bloom filter, merge bloom filters, and filter by bloom filter. These operations would need to be integrated into the optimizer as well. You can have a look at the recent addition of outer joins to get an idea of what this would mean. Due to some restrictions for chaining (only non-branching flows can be chained), the bloom filter building would not be chained. I think the filtering should be chainable. A further challenge for this approach is that we cannot give guarantees about the available memory budget right now. Memory grants are just relative shares of the total memory. So either we have to think about strategies to work with less memory or add functionality to ensure we have enough memory. # How should this feature be exposed to users? 
If we make bloom filters part of the API, we can use them also for any kind of filtering. We could do something like: {code} DataSet first = ... DataSet second = ... BloomFilter bf = first.buildBloomFilter("f0"); // "f0" is of type Z DataSet filteredSecond = second.applyBloomFilter(bf, "f1"); // "f1" is of type Z first.join(filteredSecond).where("f0").equalTo("f1") ... {code} I am not sure if this operation is common enough to be a first class member of the DataSet API (i.e., part of {{DataSet}}) or if it should to into the {{DataSetUtils}} class. > Bloom filter for Join > - > > Key: FLINK-2907 > URL: https://issues.apache.org/jira/browse/FLINK-2907 > Project: Flink > Issue Type: New Feature > Components: Java API, Scala API >Affects Versions: 1.0 >Reporter: Greg Hogan >Assignee: Greg Hogan > Labels: requires-design-doc > > A bloom filter can be a chainable operation for probe side Join elements. An > element not matched by the bloom filter will not be serialized, shipped, > deserialized, and processed. > Generating the bloom filter is a chainable operation over hash side elements. > The bloom filter created on each TaskManager must be the same size to allow > combining by xor. The most efficient means to distribute the bloom filter is > to assign each TaskManager an equal partition that it will receive from all > other TaskManagers. This will be broadcast once all local elements (by > hashing) and remote partitions (by xor) have been processed into that part of > the bloom filter. > An example with numbers: triangle listing/counting joining 2B edges on 149B > two-paths resulting in 21B triangles (this is using the optimal algorithm). > At 8 bits per element the bloom filter will have a false-positive rate of ~2% > and require a 2 GB bloom filter (stored once and shared per TaskManager). 
> Each TaskManager both sends and receives data equivalent to the size of the > bloom filter (minus the local partition, the size of which trends towards > zero as the number of TaskManagers increases). The number of matched elements > is 21B (true positive) + ~0.02*(149B-21B) = 23.5B, a reduction of 84% or 1.5 > TB (at 12 bytes per element). With 4 TaskManagers only 12 GB of bloom filter > would be transmitted, a savings of 99.2%. > Key issues are determining the size of the bloom filter (dependent on the > count of hash side elements, the available memory segments, and the error > rate) and whether this can be integrated with Join or must be a separate > operator. This also depends on dynamic memory allocation as spilling to disk > would perform the serialization, write, read, and deserialization we are > looking to avoid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
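The sizing figures in the issue (8 bits per element giving a false-positive rate of about 2%) can be checked against the standard bloom-filter formulas k = (m/n) ln 2 and p = (1 − e^(−kn/m))^k. The sketch below is only a back-of-the-envelope check of those numbers, not Flink code:

```java
// Back-of-the-envelope check of the bloom filter sizing claim above:
// with m/n = 8 bits per element and a near-optimal number of hash
// functions k = round((m/n) * ln 2), the false-positive probability is
// p = (1 - e^(-k*n/m))^k, which comes out to roughly 2%.
public class BloomFilterSizing {

    // false-positive probability for a given bits-per-element budget
    public static double falsePositiveRate(double bitsPerElement) {
        int k = Math.max(1, (int) Math.round(bitsPerElement * Math.log(2)));
        return Math.pow(1.0 - Math.exp(-k / bitsPerElement), k);
    }

    public static void main(String[] args) {
        // prints a rate of about 0.02 for 8 bits per element
        System.out.printf("8 bits/element -> FPR %.4f%n", falsePositiveRate(8.0));
    }
}
```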
[jira] [Updated] (FLINK-2907) Bloom filter for Join
[ https://issues.apache.org/jira/browse/FLINK-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-2907: - Labels: requires-design-doc (was: ) > Bloom filter for Join > - > > Key: FLINK-2907 > URL: https://issues.apache.org/jira/browse/FLINK-2907 > Project: Flink > Issue Type: New Feature > Components: Java API, Scala API >Affects Versions: 1.0 >Reporter: Greg Hogan >Assignee: Greg Hogan > Labels: requires-design-doc > > A bloom filter can be a chainable operation for probe side Join elements. An > element not matched by the bloom filter will not be serialized, shipped, > deserialized, and processed. > Generating the bloom filter is a chainable operation over hash side elements. > The bloom filter created on each TaskManager must be the same size to allow > combining by xor. The most efficient means to distribute the bloom filter is > to assign each TaskManager an equal partition that it will receive from all > other TaskManagers. This will be broadcast once all local elements (by > hashing) and remote partitions (by xor) have been processed into that part of > the bloom filter. > An example with numbers: triangle listing/counting joining 2B edges on 149B > two-paths resulting in 21B triangles (this is using the optimal algorithm). > At 8 bits per element the bloom filter will have a false-positive rate of ~2% > and require a 2 GB bloom filter (stored once and shared per TaskManager). > Each TaskManager both sends and receives data equivalent to the size of the > bloom filter (minus the local partition, the size of which trends towards > zero as the number of TaskManagers increases). The number of matched elements > is 21B (true positive) + ~0.02*(149B-21B) = 23.5B, a reduction of 84% or 1.5 > TB (at 12 bytes per element). With 4 TaskManagers only 12 GB of bloom filter > would be transmitted, a savings of 99.2%. 
> Key issues are determining the size of the bloom filter (dependent on the > count of hash side elements, the available memory segments, and the error > rate) and whether this can be integrated with Join or must be a separate > operator. This also depends on dynamic memory allocation as spilling to disk > would perform the serialization, write, read, and deserialization we are > looking to avoid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2797) CLI: Missing option to submit jobs in detached mode
[ https://issues.apache.org/jira/browse/FLINK-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976084#comment-14976084 ] ASF GitHub Bot commented on FLINK-2797: --- Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1214#discussion_r43098720 --- Diff: flink-clients/src/main/java/org/apache/flink/client/CliFrontend.java --- @@ -909,7 +919,21 @@ private int handleError(Throwable t) { } LOG.error("Error while running the command.", t); - t.printStackTrace(); + // check if the error was due to an invalid program in detached mode. + if (t instanceof ProgramInvocationException && t.getCause() instanceof DetachedProgramException) { + System.err.println(t.getCause().getMessage()); + // now trace to the user's main method. We don't wanna show unnecessary information + // in this particular case. --- End diff -- True. I didn't wanna change the existing stack traces. This will require maintaining a global variable for whether the program was run in interactive mode. The `DetachedProgramException` is a special case where we don't wanna clutter the stack trace with flink's trace, instead just show the user program's stack trace. In other cases, this might not be the case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
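The trimming discussed above — showing only the user program's portion of the stack trace for a `DetachedProgramException` — can be sketched without any Flink internals. Everything below (the `UserMainTrimmer` class, its method name, matching on the user's entry class) is a hypothetical illustration of the idea, not code from the PR:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: keep only the stack frames from the throw site down to (and
// including) the user's entry class, dropping framework plumbing below it.
class UserMainTrimmer {
    static StackTraceElement[] trimToUserMain(Throwable t, String userMainClass) {
        List<StackTraceElement> kept = new ArrayList<>();
        for (StackTraceElement frame : t.getStackTrace()) {
            kept.add(frame);
            if (frame.getClassName().equals(userMainClass)) {
                break; // everything below this frame is the framework's call chain
            }
        }
        return kept.toArray(new StackTraceElement[0]);
    }
}
```

A caller would print the trimmed frames with `System.err.println` instead of `t.printStackTrace()`, so the user sees only their own program's trace.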
[GitHub] flink pull request: [FLINK-2017] Add predefined required parameter...
Github user mliesenberg commented on the pull request: https://github.com/apache/flink/pull/1097#issuecomment-151439931 I do see your point about the uselessness of only `check`, so I'll go ahead and implement `applyTo`. I'll update the PR tonight. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2017) Add predefined required parameters to ParameterTool
[ https://issues.apache.org/jira/browse/FLINK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976116#comment-14976116 ] ASF GitHub Bot commented on FLINK-2017: --- Github user mliesenberg commented on a diff in the pull request: https://github.com/apache/flink/pull/1097#discussion_r43100940 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/utils/RequiredParameter.java --- @@ -0,0 +1,149 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.java.utils; + +import java.util.HashMap; +import java.util.Map; +import java.util.Objects; + +/** + * Facility to manage required parameters in user defined functions. + */ +public class RequiredParameter { + + private static final String HELP_TEXT_PARAM_DELIMITER = "\t"; + private static final String HELP_TEXT_LINE_DELIMITER = "\n"; + + private HashMapdata; + + public RequiredParameter() { + this.data = new HashMap<>(); + } + + public void add(Option option) throws RequiredParameterException { --- End diff -- `req.add("name").type(...).values(...)` oh, somehow I managed to miss that one. I will add a String based version, maybe we can get a third opinion on the removal? 
I'd be in favor of giving the user the option to directly add an `Option` as well. > Add predefined required parameters to ParameterTool > --- > > Key: FLINK-2017 > URL: https://issues.apache.org/jira/browse/FLINK-2017 > Project: Flink > Issue Type: Improvement >Affects Versions: 0.9 >Reporter: Robert Metzger > Labels: starter > > In FLINK-1525 we've added the {{ParameterTool}}. > During the PR review, there was a request for required parameters. > This issue is about implementing a facility to define required parameters. > The tool should also be able to print a help menu with a list of all > parameters. > This test case shows my initial ideas how to design the API > {code} > @Test > public void requiredParameters() { > RequiredParameters required = new RequiredParameters(); > Option input = required.add("input").alt("i").help("Path to > input file or directory"); // parameter with long and short variant > required.add("output"); // parameter only with long variant > Option parallelism = > required.add("parallelism").alt("p").type(Integer.class); // parameter with > type > Option spOption = > required.add("sourceParallelism").alt("sp").defaultValue(12).help("Number > specifying the number of parallel data source instances"); // parameter with > default value, specifying the type. > Option executionType = > required.add("executionType").alt("et").defaultValue("pipelined").choices("pipelined", > "batch"); > ParameterUtil parameter = ParameterUtil.fromArgs(new > String[]{"-i", "someinput", "--output", "someout", "-p", "15"}); > required.check(parameter); > required.printHelp(); > required.checkAndPopulate(parameter); > String inputString = input.get(); > int par = parallelism.getInteger(); > String output = parameter.get("output"); > int sourcePar = parameter.getInteger(spOption.getName()); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
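A minimal sketch of the fluent API proposed in the issue description, with both the `Option`-based and the String-based `add` variants discussed in the review thread. The classes below are illustrative stand-ins, not Flink's actual `ParameterTool`/`RequiredParameters` implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal Option with fluent setters, mirroring the JIRA example.
class Option {
    final String name;
    String alt;
    String help;
    String defaultValue;

    Option(String name) { this.name = name; }
    Option alt(String alt) { this.alt = alt; return this; }
    Option help(String help) { this.help = help; return this; }
    Option defaultValue(String v) { this.defaultValue = v; return this; }
}

class RequiredParameters {
    private final Map<String, Option> data = new HashMap<>();

    // String-based variant discussed in the review thread.
    Option add(String name) {
        Option option = new Option(name);
        data.put(name, option);
        return option;
    }

    // Direct Option variant, kept alongside the String-based one.
    void add(Option option) {
        data.put(option.name, option);
    }

    // check: every parameter without a default must be present
    // under its long name or its short alternative.
    void check(Map<String, String> params) {
        for (Option o : data.values()) {
            boolean present = params.containsKey(o.name)
                    || (o.alt != null && params.containsKey(o.alt));
            if (!present && o.defaultValue == null) {
                throw new RuntimeException("Missing required parameter: " + o.name);
            }
        }
    }
}
```

Usage then reads as in the JIRA example: `required.add("input").alt("i").help("Path to input file or directory");` followed by `required.check(parsedArgs);`.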
[GitHub] flink pull request: [FLINK-2017] Add predefined required parameter...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1097#issuecomment-151441146 Thanks!
[jira] [Created] (FLINK-2926) Add a Strongly Connected Components Library Method
Andra Lungu created FLINK-2926: -- Summary: Add a Strongly Connected Components Library Method Key: FLINK-2926 URL: https://issues.apache.org/jira/browse/FLINK-2926 Project: Flink Issue Type: Task Components: Gelly Affects Versions: 0.10 Reporter: Andra Lungu Priority: Minor
This algorithm operates in four main steps:
1) Form the transposed graph: each vertex sends its id to its out-neighbors, which form a transposedNeighbors set.
2) Trimming: every vertex which has only incoming or outgoing edges sets colorID to its own value and becomes inactive.
3) Forward traversal. Start phase: propagate the id to out-neighbors. Rest phase: update the colorID with the minimum value seen, until convergence.
4) Backward traversal. Start: if the vertex id is equal to its colorID, propagate the value to transposedNeighbors. Rest: each vertex that receives a message equal to its colorID propagates its colorID to the transposed graph and becomes inactive.
More info in section 3.1 of this paper: http://ilpubs.stanford.edu:8090/1077/3/p535-salihoglu.pdf or in section 6 of this paper: http://www.vldb.org/pvldb/vol7/p1821-yan.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
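For intuition, the same transpose/forward/backward idea underlies the classic sequential algorithm (Kosaraju's): one DFS pass over the original graph, then a second pass over the transposed graph in reverse finish order. This is only a sequential analogue for reference — the proposed library method would be vertex-centric, not this code:

```java
import java.util.*;

// Sequential analogue (Kosaraju's algorithm) of the transpose + forward/backward
// traversal described above. Assumes every vertex appears as a key in the map.
class Kosaraju {
    static List<List<Integer>> scc(Map<Integer, List<Integer>> graph) {
        // Pass 1: DFS on the original graph, recording vertices by finish time.
        Deque<Integer> order = new ArrayDeque<>();
        Set<Integer> seen = new HashSet<>();
        for (Integer v : graph.keySet()) dfs(graph, v, seen, order);

        // Build the transposed graph (step 1 of the description above).
        Map<Integer, List<Integer>> transposed = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> e : graph.entrySet())
            for (Integer w : e.getValue())
                transposed.computeIfAbsent(w, k -> new ArrayList<>()).add(e.getKey());

        // Pass 2: DFS on the transpose in reverse finish order; each tree is one SCC.
        seen.clear();
        List<List<Integer>> components = new ArrayList<>();
        while (!order.isEmpty()) {
            Integer v = order.pop();
            if (!seen.contains(v)) {
                Deque<Integer> component = new ArrayDeque<>();
                dfs(transposed, v, seen, component);
                components.add(new ArrayList<>(component));
            }
        }
        return components;
    }

    private static void dfs(Map<Integer, List<Integer>> g, Integer v,
                            Set<Integer> seen, Deque<Integer> finishOrder) {
        if (!seen.add(v)) return;
        for (Integer w : g.getOrDefault(v, Collections.<Integer>emptyList()))
            dfs(g, w, seen, finishOrder);
        finishOrder.push(v);
    }
}
```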
[jira] [Commented] (FLINK-2797) CLI: Missing option to submit jobs in detached mode
[ https://issues.apache.org/jira/browse/FLINK-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976226#comment-14976226 ] ASF GitHub Bot commented on FLINK-2797: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1214#issuecomment-151458018 @mxm , I have tested the build with the following commands: 1. Standalone cluster: `bin/flink run -d` for both streaming and batch wordcount examples. 2. Yarn: `bin/flink run -d -m yarn-cluster` and `bin/flink run -m yarn-cluster -yd` for both streaming and batch wordcount. Lemme know if there's any other checks to be made. Also, what should I do about https://github.com/apache/flink/pull/1214#discussion_r43098424? Let it be or return a null job id? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLINK-2905) Add intersect method to Graph class
[ https://issues.apache.org/jira/browse/FLINK-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976525#comment-14976525 ] Martin Junghanns edited comment on FLINK-2905 at 10/27/15 3:05 PM: --- Sorry for the confusion. In the example, I focused on the graphs that are induced by the vertex intersection (1 and 3 in that case). You are right, in call 1, resulting edges are the union of all input edges with identical source, target and edge values. I think, computing distinct edges would be too strict for some cases. Consider a real world example where you want to intersect two transportation networks between cities, one for trains and one for planes. Each edge has a maximum transportation capacity. There can be a case, where you have two identical edge values in both networks between the same cities. If we compute distinct, it is not possible to aggregate edges by their value (to get the total capacity). Keeping edges from both graphs allows us to compute such aggregates. If the user is not interested in duplicates, a distinct call is always possible and does not invalidate the graph. > Add intersect method to Graph class > --- > > Key: FLINK-2905 > URL: https://issues.apache.org/jira/browse/FLINK-2905 > Project: Flink > Issue Type: New Feature > Components: Gelly >Affects Versions: 0.10 >Reporter: Martin Junghanns >Assignee: Martin Junghanns >Priority: Minor > > Currently, the Gelly Graph supports the set operations > {{Graph.union(otherGraph)}} and {{Graph.difference(otherGraph)}}. It would be > nice to have a {{Graph.intersect(otherGraph)}} method, where the resulting > graph contains all vertices and edges contained in both input graphs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-2927) Provide default required configuration keys in flink-conf of binary distribution
Ufuk Celebi created FLINK-2927: -- Summary: Provide default required configuration keys in flink-conf of binary distribution Key: FLINK-2927 URL: https://issues.apache.org/jira/browse/FLINK-2927 Project: Flink Issue Type: Improvement Affects Versions: 0.10 Reporter: Ufuk Celebi Assignee: Ufuk Celebi Priority: Minor The configuration only contains a template with a subset of the required configuration keys for HA configuration. Add all arguments to make it easy to figure out how to configure it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2905) Add intersect method to Graph class
[ https://issues.apache.org/jira/browse/FLINK-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976558#comment-14976558 ] Vasia Kalavri commented on FLINK-2905: -- This is still not clear to me. Correct me if I'm wrong, but the intersection of 2 graphs is a new graph which maintains one edge for each edge that is common to the 2 input graphs, i.e. the intersection should not have duplicate edges. The example with the transportation networks should work with either a union + aggregation or a join on the edge sets. A solution would be to have the {{intersect}} method look at the IDs only and receive a UDF which can be applied on the common edge values. This way, the output graph won't have any duplicates and you can do whatever you want with the common edge values. Does this make sense? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
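The two semantics under discussion — a distinct intersection (one edge per common (source, target, value) triple) versus keeping the matching edges from both graphs so their values can still be aggregated — can be illustrated on plain edge lists. The string encoding of edges and the class name below are made up for the sketch; this is not Gelly API:

```java
import java.util.*;

// Sketch of the two intersect semantics from the FLINK-2905 discussion.
// Edges are encoded as "src->tgt:value" strings to keep the sketch dependency-free.
class IntersectSketch {
    // Distinct semantics: one edge per (src, tgt, value) triple common to both inputs.
    static List<String> distinctIntersect(List<String> a, List<String> b) {
        Set<String> result = new LinkedHashSet<>(a);
        result.retainAll(new HashSet<>(b));
        return new ArrayList<>(result);
    }

    // Paired semantics: for every matched pair, keep one copy from each input,
    // so downstream aggregation (e.g. summing capacities) still sees both edges.
    static List<String> pairedIntersect(List<String> a, List<String> b) {
        Map<String, Integer> countA = new HashMap<>();
        for (String e : a) countA.merge(e, 1, Integer::sum);
        Map<String, Integer> countB = new HashMap<>();
        for (String e : b) countB.merge(e, 1, Integer::sum);

        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : countA.entrySet()) {
            Integer inB = countB.get(e.getKey());
            if (inB != null) {
                int pairs = Math.min(e.getValue(), inB);
                for (int i = 0; i < 2 * pairs; i++) result.add(e.getKey()); // one per input graph
            }
        }
        return result;
    }
}
```

With the UDF-based variant Vasia proposes, `pairedIntersect` would instead emit one edge per pair whose value is `udf(valueA, valueB)`.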
[jira] [Created] (FLINK-2928) Confusing job status visualisation in web frontend
Ufuk Celebi created FLINK-2928: -- Summary: Confusing job status visualisation in web frontend Key: FLINK-2928 URL: https://issues.apache.org/jira/browse/FLINK-2928 Project: Flink Issue Type: Improvement Components: Webfrontend Affects Versions: 0.10 Reporter: Ufuk Celebi Priority: Minor The web frontend displays the job status in very subtle way as a colored circle next to the job name. For single tasks, the state is written out in addition to the color coding (e.g. FAILED with a red background). I would like to add this for the job status as well. It can be confusing during restarts of a job to have single tasks marked as "FAILED" w/o seeing easily what the job status is (RESTARTING). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: Add org.apache.httpcomponents:(httpcore, httpc...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/1301#issuecomment-151559475 @mxm, can you look at this for the next RC?
[jira] [Commented] (FLINK-2927) Provide default required configuration keys in flink-conf of binary distribution
[ https://issues.apache.org/jira/browse/FLINK-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976693#comment-14976693 ] ASF GitHub Bot commented on FLINK-2927: --- GitHub user uce opened a pull request: https://github.com/apache/flink/pull/1303 [FLINK-2927] [runtime] Provide default required configuration keys in flink-conf of binary distribution @mxm, can you include this in the next RC? You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/flink ha-config Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1303.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1303 commit dc750e270932a57ce728e5d2ae043b7476685909 Author: Ufuk Celebi Date: 2015-10-27T16:40:03Z [FLINK-2927] [runtime] Provide default required configuration keys in flink-conf of binary distribution -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43153270 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/MessagingFunction.java --- @@ -327,7 +327,7 @@ void setInDegree(long inDegree) { /** * Retrieve the vertex out-degree (number of out-going edges). -* @return The out-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The out-degree of this vertex --- End diff -- the `if` part is missing
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976751#comment-14976751 ] ASF GitHub Bot commented on FLINK-2559: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43153252 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/MessagingFunction.java --- @@ -314,7 +314,7 @@ public void remove() { /** * Retrieves the vertex in-degree (number of in-coming edges). -* @return The in-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The in-degree of this vertex --- End diff -- the `if` part is missing > Fix Javadoc Code Examples > - > > Key: FLINK-2559 > URL: https://issues.apache.org/jira/browse/FLINK-2559 > Project: Flink > Issue Type: Improvement >Reporter: Aljoscha Krettek >Priority: Minor > Labels: starter > > Many multiline Javadoc code examples are not correctly rendered. One of the > problems is that an @ inside a code block breaks the rendering. > This is an example that works: > {code} > * {@code > * private static class MyIndexRequestBuilder implements > IndexRequestBuilder { > * > * public IndexRequest createIndexRequest(String element, > RuntimeContext ctx) { > * Mapjson = new HashMap<>(); > * json.put("data", element); > * > * return Requests.indexRequest() > * .index("my-index") > * .type("my-type") > * .source(json); > * } > * } > * } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2930] Respect ExecutionConfig execution...
GitHub user uce opened a pull request: https://github.com/apache/flink/pull/1304 [FLINK-2930] Respect ExecutionConfig execution retry delay This only affects job recovery on non-master failures. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/flink delay Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1304.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1304 commit 7c62b7ef2b30d9a95eb661c34032d705e9e23cf4 Author: Ufuk Celebi Date: 2015-10-27T17:15:15Z [FLINK-2930] Respect ExecutionConfig execution retry delay
[jira] [Commented] (FLINK-2930) ExecutionConfig execution retry delay not respected
[ https://issues.apache.org/jira/browse/FLINK-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976769#comment-14976769 ] ASF GitHub Bot commented on FLINK-2930: --- GitHub user uce opened a pull request: https://github.com/apache/flink/pull/1304 [FLINK-2930] Respect ExecutionConfig execution retry delay This only affects job recovery on non-master failures. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/flink delay Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1304.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1304 commit 7c62b7ef2b30d9a95eb661c34032d705e9e23cf4 Author: Ufuk Celebi Date: 2015-10-27T17:15:15Z [FLINK-2930] Respect ExecutionConfig execution retry delay > ExecutionConfig execution retry delay not respected > --- > > Key: FLINK-2930 > URL: https://issues.apache.org/jira/browse/FLINK-2930 > Project: Flink > Issue Type: Bug > Components: Distributed Runtime > Affects Versions: 0.10 > Reporter: Ufuk Celebi > > Setting the execution retry delay via the ExecutionConfig is not respected by > the ExecutionGraph on restarts (this is only relevant for non-master > failures).
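The bug report says restarts ignore the delay configured on the ExecutionConfig. As a plain-Java sketch of the semantics the fix restores (the class and method names below are illustrative, not Flink's API; in Flink the values come from the ExecutionConfig), failed attempts should be separated by the configured delay rather than retried immediately:

```java
import java.util.concurrent.Callable;

/** Sketch of retry-with-delay semantics; not Flink code. */
public class RetryDelaySketch {

    /**
     * Runs {@code task} until it returns true, allowing {@code maxRetries}
     * retries and sleeping {@code retryDelayMillis} between attempts.
     * Returns the number of attempts that were needed.
     */
    public static int runWithRetries(Callable<Boolean> task, int maxRetries,
                                     long retryDelayMillis) throws Exception {
        int attempts = 0;
        while (true) {
            attempts++;
            if (task.call()) {
                return attempts; // success
            }
            if (attempts > maxRetries) {
                throw new RuntimeException("failed after " + attempts + " attempts");
            }
            // The configured retry delay; the reported bug was that restarts
            // happened immediately, ignoring this value.
            Thread.sleep(retryDelayMillis);
        }
    }
}
```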
[GitHub] flink pull request: Add org.apache.httpcomponents:(httpcore, httpc...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/1301#issuecomment-151577309 +1 looks good. will merge this.
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43158439 --- Diff: flink-contrib/flink-storm/src/main/java/org/apache/flink/storm/util/SplitStreamType.java --- @@ -20,10 +20,11 @@ import org.apache.flink.streaming.api.datastream.DataStream; /** - * Used by {@link org.apache.flink.storm.wrappers.AbstractStormCollector AbstractStormCollector} to wrap + * Used by org.apache.flink.storm.wrappers.AbstractStormCollector to wrap --- End diff -- Sounds good to me.
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976832#comment-14976832 ] ASF GitHub Bot commented on FLINK-2559: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43158439 --- Diff: flink-contrib/flink-storm/src/main/java/org/apache/flink/storm/util/SplitStreamType.java --- @@ -20,10 +20,11 @@ import org.apache.flink.streaming.api.datastream.DataStream; /** - * Used by {@link org.apache.flink.storm.wrappers.AbstractStormCollector AbstractStormCollector} to wrap + * Used by org.apache.flink.storm.wrappers.AbstractStormCollector to wrap --- End diff -- Sounds good to me.
[jira] [Created] (FLINK-2931) Add global recovery timestamp to state restore
Gyula Fora created FLINK-2931: - Summary: Add global recovery timestamp to state restore Key: FLINK-2931 URL: https://issues.apache.org/jira/browse/FLINK-2931 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gyula Fora Assignee: Gyula Fora While there is information about a global checkpoint timestamp on snapshots, there is no global recovery timestamp available for restore, which may be necessary for some more advanced restore logic.
[GitHub] flink pull request: [FLINK-2927] [runtime] Provide default require...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/1303#issuecomment-151575894 +1 will merge this.
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976805#comment-14976805 ] ASF GitHub Bot commented on FLINK-2559: --- Github user hczerpak commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43156806 --- Diff: flink-contrib/flink-storm/src/main/java/org/apache/flink/storm/util/SplitStreamType.java --- @@ -20,10 +20,11 @@ import org.apache.flink.streaming.api.datastream.DataStream; /** - * Used by {@link org.apache.flink.storm.wrappers.AbstractStormCollector AbstractStormCollector} to wrap + * Used by org.apache.flink.storm.wrappers.AbstractStormCollector to wrap --- End diff -- Javadoc couldn't link that class. I suspect it's because AbstractStormCollector has package access and the link was created from outside that package. The least I could do was to leave the raw path to that class.
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user hczerpak commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43157767 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/MessagingFunction.java --- @@ -314,7 +314,7 @@ public void remove() { /** * Retrieves the vertex in-degree (number of in-coming edges). -* @return The in-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The in-degree of this vertex --- End diff -- There is no IterationConfiguration#setOptDegrees(boolean) any more, and if you look at the source code this javadoc relates to, it looks like a copy-paste comment from somewhere else.
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976817#comment-14976817 ] ASF GitHub Bot commented on FLINK-2559: --- Github user hczerpak commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43157767 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/MessagingFunction.java --- @@ -314,7 +314,7 @@ public void remove() { /** * Retrieves the vertex in-degree (number of in-coming edges). -* @return The in-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The in-degree of this vertex --- End diff -- There is no IterationConfiguration#setOptDegrees(boolean) any more, and if you look at the source code this javadoc relates to, it looks like a copy-paste comment from somewhere else.
[jira] [Commented] (FLINK-2929) Recovery of jobs on cluster restarts
[ https://issues.apache.org/jira/browse/FLINK-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976712#comment-14976712 ] Aljoscha Krettek commented on FLINK-2929: - I think we have to fix it, yes. I'm not sure which should be the default behavior though. I gravitate towards making recovery of old jobs the default. But I see how it could be confusing... > Recovery of jobs on cluster restarts > > > Key: FLINK-2929 > URL: https://issues.apache.org/jira/browse/FLINK-2929 > Project: Flink > Issue Type: Improvement > Affects Versions: 0.10 > Reporter: Ufuk Celebi > > Recovery information is stored in ZooKeeper under a static root like > {{/flink}}. In case of a cluster restart without canceling running jobs old > jobs will be recovered from ZooKeeper. > This can be confusing or helpful depending on the use case. > I suspect that the confusing case will be more common. > We can change the default cluster start up (e.g. new YARN session or new > ./start-cluster call) to purge all existing data in ZooKeeper and add a flag > to not do this if needed. > [~trohrm...@apache.org], [~aljoscha], [~StephanEwen] what's your opinion?
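The proposal in the issue is that a fresh cluster start purges all recovery data under the ZooKeeper root unless a flag asks to keep old jobs. A toy in-memory model of that behavior (a sketch only: real Flink talks to ZooKeeper, and the class, method, and flag names here are hypothetical):

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy model of the proposed start-up behavior: purge old recovery data
 * under the configured root path unless told to keep it.
 */
public class RecoveryRootSketch {

    public static Map<String, String> startCluster(Map<String, String> zkNodes,
                                                   String root,
                                                   boolean keepOldJobs) {
        Map<String, String> state = new TreeMap<>(zkNodes);
        if (!keepOldJobs) {
            // Proposed default: drop every node under e.g. /flink so that a
            // restarted cluster does not unexpectedly recover old jobs.
            state.keySet().removeIf(path -> path.startsWith(root + "/"));
        }
        return state;
    }
}
```

With `keepOldJobs` set, the old job data survives the restart, which is the behavior Aljoscha gravitates towards as the default in the comment above.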
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976758#comment-14976758 ] ASF GitHub Bot commented on FLINK-2559: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43153436 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/VertexUpdateFunction.java --- @@ -219,7 +219,7 @@ void setInDegree(long inDegree) { /** * Retrieve the vertex out-degree (number of out-going edges). -* @return The out-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The out-degree of this vertex --- End diff -- the `if` part is missing
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43153428 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/VertexUpdateFunction.java --- @@ -206,7 +206,7 @@ void setOutput(Vertex outVal, Collector > out) { /** * Retrieves the vertex in-degree (number of in-coming edges). -* @return The in-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The in-degree of this vertex --- End diff -- the `if` part is missing
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976757#comment-14976757 ] ASF GitHub Bot commented on FLINK-2559: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43153428 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/VertexUpdateFunction.java --- @@ -206,7 +206,7 @@ void setOutput(Vertex outVal, Collector > out) { /** * Retrieves the vertex in-degree (number of in-coming edges). -* @return The in-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The in-degree of this vertex --- End diff -- the `if` part is missing
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43153436 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/VertexUpdateFunction.java --- @@ -219,7 +219,7 @@ void setInDegree(long inDegree) { /** * Retrieve the vertex out-degree (number of out-going edges). -* @return The out-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The out-degree of this vertex --- End diff -- the `if` part is missing
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user hczerpak closed the pull request at: https://github.com/apache/flink/pull/1298
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43158080 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/MessagingFunction.java --- @@ -314,7 +314,7 @@ public void remove() { /** * Retrieves the vertex in-degree (number of in-coming edges). -* @return The in-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The in-degree of this vertex --- End diff -- I see, but the next line should be removed then as well, right?
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
GitHub user hczerpak reopened a pull request: https://github.com/apache/flink/pull/1298 [FLINK-2559] Fix Javadoc Code Examples Initially I made only a handful of fixes to the javadocs, replacing a few @ with {@literal @}. Running mvn javadoc:javadoc revealed lots of javadoc problems: - broken HTML tags - unclosed HTML tags - lots of > and < characters used directly in javadoc - source code examples not wrapped with {@code } - incorrect references to classes and methods in @see or @link tags - @throws tags when no exception is being thrown (or a different one is) - no @throws when an exception is being thrown from the method - typos Unfortunately Travis doesn't run javadoc compilation, so it will not show that this actually works. You can merge this pull request into a Git repository by running: $ git pull https://github.com/hczerpak/flink FLINK-2559 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1298.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1298 commit 06caac6c90d8c888c326fe9b41552b2241ba584c Author: hczerpak Date: 2015-10-22T10:10:08Z Merge remote-tracking branch 'apache/master' commit 63bd9833bb6e396ea91f3b98e863079e0d764773 Author: hczerpak Date: 2015-10-22T13:27:18Z Merge remote-tracking branch 'apache/master' commit 3d86dd79e0735b1ddb052d07bc23f476d4512168 Author: Hubert Czerpak Date: 2015-10-22T19:42:31Z Merge remote-tracking branch 'apache/master' into FLINK-2559 commit d4f770aed5ecb61ab66f442e7d5e28ef7b18338e Author: Hubert Czerpak Date: 2015-10-22T21:51:25Z @literal Replaced @ character with {@literal @} in a few places. Not so many occurrences. One did not need to be HTML encoded.
commit 63f6dd07252ebbfc583e6de2520e15b88332ad97 Author: Hubert Czerpak Date: 2015-10-23T09:15:37Z Merge remote-tracking branch 'apache/master' into FLINK-2559 commit 23768356487f4f062f491f7617cb8ebfeb952392 Author: Hubert Czerpak Date: 2015-10-23T10:31:22Z Merge remote-tracking branch 'apache/master' into FLINK-2559 commit 516be48b02016a68ef049f9074326eb1f75f7e6c Author: Hubert Czerpak Date: 2015-10-23T14:42:11Z all javadoc is building fine now Removed all javadoc compilation errors commit 679630d13ec4909025f4f9aa4bddfd492824c58a Author: Hubert Czerpak Date: 2015-10-23T14:42:24Z Merge remote-tracking branch 'apache/master' into FLINK-2559 commit 940e1f317b7a5af613ea551cd8978cc799d6fac1 Author: Hubert Czerpak Date: 2015-10-23T17:17:24Z Merge remote-tracking branch 'apache/master' into FLINK-2559
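The problem categories the PR description lists are mostly mechanical to fix. A hedged before/after sketch (the class below is illustrative, not taken from the PR): the two most common cases are raw `<`/`>` characters, which get wrapped in `{@code ...}` so the angle brackets are not parsed as HTML, and code samples left unwrapped:

```java
/**
 * Before the fix, a doc comment might read: "returns a List of
 * Tuple2&lt;K, V&gt; elements" written with raw angle brackets, which breaks
 * the generated HTML. After the fix it reads: returns a list of
 * {@code Tuple2<K, V>} elements, with the generic type inside a code tag.
 */
public class JavadocFixSketch {

    /** Pairs up a key and a value as a two-element array, {@code [key, value]}. */
    public static Object[] pair(Object key, Object value) {
        return new Object[] {key, value};
    }
}
```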
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user hczerpak commented on the pull request: https://github.com/apache/flink/pull/1298#issuecomment-151583330 Sure. I appreciate your comments. It was not always straightforward what to do with errors. Could you please take a look at my comments above and say what you think? thanks
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976824#comment-14976824 ] ASF GitHub Bot commented on FLINK-2559: --- Github user hczerpak closed the pull request at: https://github.com/apache/flink/pull/1298
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976822#comment-14976822 ] ASF GitHub Bot commented on FLINK-2559: --- Github user hczerpak commented on the pull request: https://github.com/apache/flink/pull/1298#issuecomment-151583330 Sure. I appreciate your comments. It was not always straightforward what to do with errors. Could you please take a look at my comments above and say what you think? thanks
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976823#comment-14976823 ] ASF GitHub Bot commented on FLINK-2559: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43158080 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/MessagingFunction.java --- @@ -314,7 +314,7 @@ public void remove() { /** * Retrieves the vertex in-degree (number of in-coming edges). -* @return The in-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)} +* @return The in-degree of this vertex --- End diff -- I see, but the next line should be removed then as well, right?
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1298#issuecomment-151576323 Hi @hczerpak, thanks a lot for the thorough cleaning of the JavaDocs! I spotted just a few things that should be improved. The commit history of your PR is also a bit messy. Would you mind rebasing it to get rid of the merge commits? Thanks again, Fabian
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976779#comment-14976779 ] ASF GitHub Bot commented on FLINK-2559: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1298#issuecomment-151576323 Hi @hczerpak, thanks a lot for the thorough cleaning of the JavaDocs! I spotted just a few things that should be improved. The commit history of your PR is also a bit messy. Would you mind rebasing it to get rid of the merge commits? Thanks again, Fabian
[GitHub] flink pull request: Out-of-core state backend for JDBC databases
GitHub user gyfora opened a pull request: https://github.com/apache/flink/pull/1305 Out-of-core state backend for JDBC databases Detailed description incoming... You can merge this pull request into a Git repository by running: $ git pull https://github.com/gyfora/flink master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1305.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1305 commit b793bca20b79c1fe38ed7a31deca485e7d109060 Author: Gyula Fora Date: 2015-10-26T08:58:49Z [FLINK-2916] [streaming] Expose operator and task information to StateBackend commit c35949f5e765f377799730a973b374eeea9c3921 Author: Gyula Fora Date: 2015-10-27T17:31:04Z [FLINK-2924] [streaming] Out-of-core state backend for JDBC databases
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976812#comment-14976812 ] ASF GitHub Bot commented on FLINK-2559: --- Github user hczerpak commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43157321 --- Diff: flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Entities.java --- @@ -21,8 +21,8 @@ import java.util.List; /** - * Entities which have been parsed out of the text of the - * {@link package org.apache.flink.contrib.tweetinputformat.model.tweet.Tweet}. + * entities which have been parsed out of the text of the --- End diff -- Good point. Didn't intend to lower that case.
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user hczerpak commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43157321 --- Diff: flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Entities.java --- @@ -21,8 +21,8 @@ import java.util.List; /** - * Entities which have been parsed out of the text of the - * {@link package org.apache.flink.contrib.tweetinputformat.model.tweet.Tweet}. + * entities which have been parsed out of the text of the --- End diff -- Good point. Didn't intend to lower that case.
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976724#comment-14976724 ]

ASF GitHub Bot commented on FLINK-2559: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43151435

--- Diff: flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Entities.java ---
@@ -21,8 +21,8 @@ import java.util.List;
 /**
- * Entities which have been parsed out of the text of the
- * {@link package org.apache.flink.contrib.tweetinputformat.model.tweet.Tweet}.
+ * entities which have been parsed out of the text of the
--- End diff --

Why lowercase?
[GitHub] flink pull request: [FLINK-7] [Runtime] Enable Range Partitioner.
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1255#discussion_r43156865

--- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java ---
@@ -1223,6 +1230,51 @@ public long count() throws Exception {
 final TypeInformation keyType = TypeExtractor.getKeySelectorTypes(keyExtractor, getType());
 return new PartitionOperator(this, PartitionMethod.HASH, new Keys.SelectorFunctionKeys(clean(keyExtractor), this.getType(), keyType), Utils.getCallLocationName());
 }
+
+ /**
+ * Range-partitions a DataSet using the specified KeySelector.
+ *
+ * Important: This operation shuffles the whole DataSet over the network and can take a significant amount of time.
+ *
+ * @param keySelector The KeySelector with which the DataSet is range-partitioned.
+ * @return The partitioned DataSet.
+ *
+ * @see KeySelector
+ */
+ public > DataSet partitionByRange(KeySelector keySelector) {
+ final TypeInformation keyType = TypeExtractor.getKeySelectorTypes(keySelector, getType());
+ String callLocation = Utils.getCallLocationName();
+
+ // Extract key from input element by keySelector.
+ KeyExtractorMapper keyExtractorMapper = new KeyExtractorMapper(keySelector);
--- End diff --

I think you would still have the nodes and all the information of the `OptimizedPlan` available in `connectJobVertices()`. However, I would also be OK to do it as a preprocessing step in `compileJobGraph()`. Let me know if you face any obstacles or have any questions.
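The sampling-and-assignment step debated above can be sketched end to end: draw a sample of keys, sort it, pick evenly spaced boundary keys (one fewer than the target parallelism), and route each record by binary search over those boundaries. This is an illustrative simplification with hypothetical names (`boundaries`, `partitionOf`), not Flink's actual implementation or the code under review.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Illustrative sketch of sampling-based range partitioning (hypothetical
 * names, not Flink's implementation).
 */
public class RangePartitionSketch {

    /**
     * Derives (parallelism - 1) boundary keys from a sample of the data by
     * sorting the sample and picking evenly spaced elements.
     */
    public static int[] boundaries(List<Integer> sampledKeys, int parallelism) {
        List<Integer> sorted = new ArrayList<>(sampledKeys);
        Collections.sort(sorted);
        int[] bounds = new int[parallelism - 1];
        for (int i = 1; i < parallelism; i++) {
            bounds[i - 1] = sorted.get(i * sorted.size() / parallelism);
        }
        return bounds;
    }

    /**
     * Assigns a key to a partition by binary search over the boundaries:
     * keys less than or equal to bounds[i] (and greater than bounds[i-1])
     * go to partition i; keys above the last boundary go to the last one.
     */
    public static int partitionOf(int key, int[] bounds) {
        int pos = Arrays.binarySearch(bounds, key);
        return pos >= 0 ? pos : -(pos + 1);
    }
}
```

The boundary quality, and hence the balance of the resulting partitions, depends entirely on how representative the sample is, which is why the discussion above centers on where in the plan the sampling pass can be injected.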
[jira] [Commented] (FLINK-7) [GitHub] Enable Range Partitioner

[ https://issues.apache.org/jira/browse/FLINK-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976806#comment-14976806 ] ASF GitHub Bot commented on FLINK-7: Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1255#discussion_r43156865
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976825#comment-14976825 ]

ASF GitHub Bot commented on FLINK-2559: --- GitHub user hczerpak reopened a pull request: https://github.com/apache/flink/pull/1298 [FLINK-2559] Fix Javadoc Code Examples

Initially I made only a handful of fixes to the Javadocs, replacing a few @ characters with {@literal @}. Running mvn javadoc:javadoc revealed many more Javadoc problems:
- broken HTML tags
- unclosed HTML tags
- many > and < characters used directly in Javadoc
- source code examples not wrapped in {@code }
- incorrect references to classes and methods in @see or @link tags
- @throws tags where no exception (or a different one) is thrown
- missing @throws tags where a method does throw an exception
- typos

Unfortunately, Travis does not run Javadoc compilation, so the build will not show that the fix actually works.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hczerpak/flink FLINK-2559
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1298.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1298

commit 06caac6c90d8c888c326fe9b41552b2241ba584c Author: hczerpak Date: 2015-10-22T10:10:08Z Merge remote-tracking branch 'apache/master'
commit 63bd9833bb6e396ea91f3b98e863079e0d764773 Author: hczerpak Date: 2015-10-22T13:27:18Z Merge remote-tracking branch 'apache/master'
commit 3d86dd79e0735b1ddb052d07bc23f476d4512168 Author: Hubert Czerpak Date: 2015-10-22T19:42:31Z Merge remote-tracking branch 'apache/master' into FLINK-2559
commit d4f770aed5ecb61ab66f442e7d5e28ef7b18338e Author: Hubert Czerpak Date: 2015-10-22T21:51:25Z @literal: Replaced @ character with {@literal @} in a few places. Not so many occurrences. One did not need to be HTML-encoded.
commit 63f6dd07252ebbfc583e6de2520e15b88332ad97 Author: Hubert Czerpak Date: 2015-10-23T09:15:37Z Merge remote-tracking branch 'apache/master' into FLINK-2559
commit 23768356487f4f062f491f7617cb8ebfeb952392 Author: Hubert Czerpak Date: 2015-10-23T10:31:22Z Merge remote-tracking branch 'apache/master' into FLINK-2559
commit 516be48b02016a68ef049f9074326eb1f75f7e6c Author: Hubert Czerpak Date: 2015-10-23T14:42:11Z all javadoc is building fine now: Removed all javadoc compilation errors
commit 679630d13ec4909025f4f9aa4bddfd492824c58a Author: Hubert Czerpak Date: 2015-10-23T14:42:24Z Merge remote-tracking branch 'apache/master' into FLINK-2559
commit 940e1f317b7a5af613ea551cd8978cc799d6fac1 Author: Hubert Czerpak Date: 2015-10-23T17:17:24Z Merge remote-tracking branch 'apache/master' into FLINK-2559
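Two entries from the problem list above can be made concrete with a small hypothetical class (not taken from the PR): raw `<` and `>` characters in a doc comment corrupt the generated HTML, whereas wrapping the generic type in a `{@code ...}` tag keeps them safe, and class references belong in well-formed `@see` or `{@link}` tags rather than bare text.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch of two Javadoc fixes from the FLINK-2559 list. The
 * return type below is documented as {@code List<String>}: the angle
 * brackets are safe inside a code tag, while writing them bare would break
 * the generated HTML.
 *
 * @see java.util.List
 */
public class JavadocAngleBracketExample {

    /** Splits {@code csv} on commas and returns the pieces as a {@code List<String>}. */
    public static List<String> names(String csv) {
        return Arrays.asList(csv.split(","));
    }
}
```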
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976830#comment-14976830 ] ASF GitHub Bot commented on FLINK-2559: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1298#issuecomment-151583907 No worries :-) Thanks for the quick response!
[jira] [Commented] (FLINK-2559) Fix Javadoc Code Examples
[ https://issues.apache.org/jira/browse/FLINK-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976826#comment-14976826 ] ASF GitHub Bot commented on FLINK-2559: --- Github user hczerpak commented on the pull request: https://github.com/apache/flink/pull/1298#issuecomment-151583549 Sorry for the misclick. Didn't intend to close and reopen.
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43151247

--- Diff: flink-contrib/flink-storm/src/main/java/org/apache/flink/storm/util/SplitStreamType.java ---
@@ -20,10 +20,11 @@ import org.apache.flink.streaming.api.datastream.DataStream;
 /**
- * Used by {@link org.apache.flink.storm.wrappers.AbstractStormCollector AbstractStormCollector} to wrap
+ * Used by org.apache.flink.storm.wrappers.AbstractStormCollector to wrap
--- End diff --

Why did you remove this link?
[GitHub] flink pull request: [FLINK-2559] Fix Javadoc Code Examples
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1298#discussion_r43153252

--- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargel/MessagingFunction.java ---
@@ -314,7 +314,7 @@ public void remove() {
 /**
 * Retrieves the vertex in-degree (number of in-coming edges).
-* @return The in-degree of this vertex if the {@link IterationConfiguration#setOptDegrees(boolean)}
+* @return The in-degree of this vertex
--- End diff --

the `if` part is missing
[jira] [Commented] (FLINK-2927) Provide default required configuration keys in flink-conf of binary distribution
[ https://issues.apache.org/jira/browse/FLINK-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976795#comment-14976795 ] ASF GitHub Bot commented on FLINK-2927: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1303

> Provide default required configuration keys in flink-conf of binary distribution
> -
>
> Key: FLINK-2927
> URL: https://issues.apache.org/jira/browse/FLINK-2927
> Project: Flink
> Issue Type: Improvement
> Affects Versions: 0.10
> Reporter: Ufuk Celebi
> Assignee: Ufuk Celebi
> Priority: Minor
>
> The configuration only contains a template with a subset of the required configuration keys for HA configuration. Add all arguments to make it easy to figure out how to configure it.
[GitHub] flink pull request: Add org.apache.httpcomponents:(httpcore, httpc...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1301