[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507333#comment-14507333 ]

Till Rohrmann commented on FLINK-1745:
--------------------------------------

Hi Raghav,

if I understood it correctly, approaches 1 and 2 implement the same approximate kNN algorithm; the difference is only that the first paper implements it on MapReduce and the second on a relational database. I think we should eventually add an approximate kNN implementation to the ML library, because we want to scale to large amounts of data. The exact implementations can still act as good baseline methods, though. The problem with zkNN, IMHO, is calculating the z-value for double-valued feature vectors.

There is another paper, http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4118757, which implements a different approximation algorithm for kNN. It might be an alternative to zkNN, or at least serve as a good point of comparison. Maybe each of you, [~raghav.chalapa...@gmail.com] and [~chiwanpark], picks one algorithm to implement, and then we give users of the ML library the choice of what suits them best. What do you think?

> Add k-nearest-neighbours algorithm to machine learning library
> --------------------------------------------------------------
>
>                 Key: FLINK-1745
>                 URL: https://issues.apache.org/jira/browse/FLINK-1745
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Chiwan Park
>              Labels: ML, Starter
>
> Even though the k-nearest-neighbours (kNN) [1, 2] algorithm is quite simple, it is still widely used for classification and regression. Could be a starter task.
> Resources:
> [1] http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
> [2] https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
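The z-value computation the comment flags as tricky can be sketched as bit interleaving over quantized coordinates. This is a minimal illustration under stated assumptions, not Flink code: the quantization range, bit width, and all names (`quantize`, `zValue`, `BITS`) are invented for the example, and zkNN additionally uses several randomly shifted copies of the data to soften the boundary effects of a single space-filling curve.

```java
// Sketch: a z-value (Morton code) for a double-valued feature vector.
// Assumption: each coordinate is first quantized to a fixed-width unsigned
// integer over a known [min, max] range; the z-value then interleaves the
// bits of the per-dimension integers, most significant bit first.
public class ZValueSketch {
    static final int BITS = 16; // bits kept per dimension after quantization

    // Map a double in [min, max] to an unsigned BITS-bit integer.
    static long quantize(double x, double min, double max) {
        double t = (x - min) / (max - min);   // normalize to [0, 1]
        t = Math.min(1.0, Math.max(0.0, t));  // clamp out-of-range values
        return Math.min((1L << BITS) - 1, (long) (t * (1L << BITS)));
    }

    // Interleave the bits of the quantized coordinates.
    static long zValue(double[] v, double min, double max) {
        long z = 0;
        for (int bit = BITS - 1; bit >= 0; bit--) {
            for (int d = 0; d < v.length; d++) {
                z = (z << 1) | ((quantize(v[d], min, max) >>> bit) & 1L);
            }
        }
        return z;
    }

    public static void main(String[] args) {
        // Nearby points usually get nearby z-values; points far apart do not.
        long a = zValue(new double[]{0.10, 0.10}, 0.0, 1.0);
        long b = zValue(new double[]{0.11, 0.10}, 0.0, 1.0);
        long c = zValue(new double[]{0.90, 0.90}, 0.0, 1.0);
        System.out.println(Math.abs(a - b) < Math.abs(a - c));
    }
}
```

Sorting points by such z-values lets a kNN candidate search become a range scan over a one-dimensional order, which is what makes the zkNN approach fit MapReduce-style systems.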
[jira] [Issue Comment Deleted] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Márton Balassi updated FLINK-1865:
----------------------------------
    Comment: was deleted

(was: Go ahead.)

> Unstable test KafkaITCase
> -------------------------
>
>                 Key: FLINK-1865
>                 URL: https://issues.apache.org/jira/browse/FLINK-1865
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming, Tests
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Márton Balassi
>
> {code}
> Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
> 04/10/2015 13:46:53  Job execution switched to status RUNNING.
> 04/10/2015 13:46:53  Custom Source -> Stream Sink(1/1) switched to SCHEDULED
> 04/10/2015 13:46:53  Custom Source -> Stream Sink(1/1) switched to DEPLOYING
> 04/10/2015 13:46:53  Custom Source -> Stream Sink(1/1) switched to SCHEDULED
> 04/10/2015 13:46:53  Custom Source -> Stream Sink(1/1) switched to DEPLOYING
> 04/10/2015 13:46:53  Custom Source -> Stream Sink(1/1) switched to RUNNING
> 04/10/2015 13:46:53  Custom Source -> Stream Sink(1/1) switched to RUNNING
> 04/10/2015 13:47:04  Custom Source -> Stream Sink(1/1) switched to FAILED
> java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
> 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
> 	at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
> 	at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
> 	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
> 	at java.lang.Thread.run(Thread.java:701)
> Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
> 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
> 	at org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54)
> 	at org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39)
> 	at org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196)
> 	at org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42)
> 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
> 	... 4 more
> Caused by: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
> 	at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166)
> 	at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141)
> 	at org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41)
> 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
> 	... 9 more
> 04/10/2015 13:47:04  Job execution switched to status FAILING.
> 04/10/2015 13:47:04  Custom Source -> Stream Sink(1/1) switched to CANCELING
> 04/10/2015 13:47:04  Custom Source -> Stream Sink(1/1) switched to CANCELED
> 04/10/2015 13:47:04  Job execution switched to status FAILED.
> 04/10/2015 13:47:05  Job execution switched to status RUNNING.
> [a second failure cycle at 13:47:05-13:47:15 repeats the same exception; log truncated]
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (FLINK-1931) Add ADMM to optimization framework
Till Rohrmann created FLINK-1931:
------------------------------------

             Summary: Add ADMM to optimization framework
                 Key: FLINK-1931
                 URL: https://issues.apache.org/jira/browse/FLINK-1931
             Project: Flink
          Issue Type: New Feature
          Components: Machine Learning Library
            Reporter: Till Rohrmann

With the optimization framework soon in place, we can think about adding more optimization algorithms to it. One new addition could be the alternating direction method of multipliers (ADMM) [1, 2].

Resources:
[1] http://stanford.edu/~boyd/admm.html
[2] http://en.wikipedia.org/wiki/Augmented_Lagrangian_method

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
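For orientation, the ADMM iteration described in Boyd et al. [1] can be sketched on the smallest problem that exercises all three update steps: the one-dimensional lasso, minimize (1/2)(x - v)^2 + lambda*|z| subject to x = z. This is a hypothetical illustration, not a proposed Flink API; the problem, parameter values, and all names (`solve`, `softThreshold`, `rho`) are chosen only for the example.

```java
// Scaled-form ADMM on a 1-D lasso problem. The three steps per iteration are:
//   x-update: proximal step on the smooth quadratic term
//   z-update: proximal step on the l1 term (soft thresholding)
//   u-update: gradient ascent on the scaled dual of the constraint x = z
public class AdmmSketch {
    static double softThreshold(double a, double k) {
        return Math.signum(a) * Math.max(0.0, Math.abs(a) - k);
    }

    static double solve(double v, double lambda, double rho, int iters) {
        double x = 0, z = 0, u = 0; // primal x, consensus z, scaled dual u
        for (int i = 0; i < iters; i++) {
            x = (v + rho * (z - u)) / (1 + rho);    // x-update (closed form)
            z = softThreshold(x + u, lambda / rho); // z-update (l1 prox)
            u = u + x - z;                          // dual update on x = z
        }
        return z;
    }

    public static void main(String[] args) {
        // The known closed-form solution is softThreshold(v, lambda):
        // for v = 3, lambda = 1 the iterates approach 2.
        System.out.println(solve(3.0, 1.0, 1.0, 200));
    }
}
```

In a distributed setting the appeal is that the x-update decomposes over data partitions, with only the consensus variable z and the duals exchanged per round, which is what would map onto an iteration in Flink's optimization framework.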
[jira] [Updated] (FLINK-1931) Add ADMM to optimization framework
[ https://issues.apache.org/jira/browse/FLINK-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Rohrmann updated FLINK-1931:
---------------------------------
    Labels: ML  (was: )

> Add ADMM to optimization framework
> ----------------------------------
>
>                 Key: FLINK-1931
>                 URL: https://issues.apache.org/jira/browse/FLINK-1931
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>              Labels: ML

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo
[ https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507004#comment-14507004 ]

ASF GitHub Bot commented on FLINK-1909:
---------------------------------------

Github user gyfora commented on a diff in the pull request:

    https://github.com/apache/flink/pull/610#discussion_r28868351

    --- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java ---
    @@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
    	 * @return the data stream constructed
    	 */
    	@SuppressWarnings("unchecked")
    -	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
    -			TypeInformation<OUT> outTypeInfo, String sourceName) {
    +	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
    +
    +		TypeInformation<OUT> outTypeInfo;
    -		if (outTypeInfo == null) {
    -			if (function instanceof GenericSourceFunction) {
    -				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
    -			} else {
    +		if (function instanceof GenericSourceFunction) {
    +			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
    +		} else {
    +			try {
     				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class, function.getClass(), 0, null, null);
    +			} catch (InvalidTypesException e) {
    +				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
    --- End diff --

    This is the same mechanism that all Flink operators use to allow missing types, which can be filled in with .returns(..) afterwards.

> Refactor streaming scala api to use returns for adding typeinfo
> ---------------------------------------------------------------
>
>                 Key: FLINK-1909
>                 URL: https://issues.apache.org/jira/browse/FLINK-1909
>             Project: Flink
>          Issue Type: Improvement
>          Components: Scala API, Streaming
>            Reporter: Gyula Fora
>            Assignee: Gyula Fora
>
> Currently the streaming scala api uses transform to pass the extracted type information instead of .returns. This leads to a lot of code duplication.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
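The mechanism discussed in the review comment — type extraction may fail at source-creation time, a placeholder "missing type" is stored instead of throwing, and a later .returns(..) call fills it in — can be sketched with simplified stand-in classes. None of the classes below are Flink's actual types (`Stream`, `addSource`, `returns` here are toy stand-ins); this only illustrates the deferred-type pattern.

```java
import java.util.Optional;

// Toy model of deferred type information: extraction failure is tolerated
// at construction, and the error only surfaces if the type is used before
// the user supplies it explicitly.
public class ReturnsSketch {
    static final class Stream<T> {
        private Optional<Class<T>> typeInfo;
        Stream(Optional<Class<T>> extracted) { this.typeInfo = extracted; }

        // Stand-in for .returns(..): the user supplies the missing type.
        Stream<T> returns(Class<T> clazz) {
            this.typeInfo = Optional.of(clazz);
            return this;
        }

        // Stand-in for using MissingTypeInfo: throws only if still unfilled.
        Class<T> getType() {
            return typeInfo.orElseThrow(
                () -> new IllegalStateException("type still missing; call returns(..)"));
        }
    }

    // Stand-in for addSource(): extraction may fail, but we do not throw yet.
    static <T> Stream<T> addSource(boolean extractionSucceeds, Class<T> extracted) {
        return new Stream<>(extractionSucceeds ? Optional.of(extracted) : Optional.empty());
    }

    public static void main(String[] args) {
        Stream<String> s = ReturnsSketch.<String>addSource(false, null).returns(String.class);
        System.out.println(s.getType().getSimpleName()); // prints "String"
    }
}
```

The design choice being debated is exactly this: swallowing the extraction failure keeps the fluent API usable for lambdas and generic sources, at the cost of moving the error from construction time to first use.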
[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507058#comment-14507058 ]

Robert Metzger commented on FLINK-1865:
---------------------------------------

The issue still persists (this is a build from today's master):
https://s3.amazonaws.com/archive.travis-ci.org/jobs/59538212/log.txt

> Unstable test KafkaITCase
> -------------------------
>
>                 Key: FLINK-1865
>                 URL: https://issues.apache.org/jira/browse/FLINK-1865
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming, Tests
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Márton Balassi

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507069#comment-14507069 ]

Robert Metzger commented on FLINK-1865:
---------------------------------------

I have written to the Kafka mailing list to ask about our PersistentKafkaSource setup, hoping to resolve all the Kafka issues:
http://mail-archives.apache.org/mod_mbox/kafka-users/201504.mbox/%3CCAGr9p8Dvx1OKo0q4iTtR7ped7rxgcaHKpbndo3imuJzLvuG03Q%40mail.gmail.com%3E

Once we find a way to use the high-level consumer with manual offset control, this issue will certainly be resolved.

[~bamrabi]: is it okay with you if I assign this issue to myself?

> Unstable test KafkaITCase
> -------------------------
>
>                 Key: FLINK-1865
>                 URL: https://issues.apache.org/jira/browse/FLINK-1865
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming, Tests
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Márton Balassi

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo
[ https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507012#comment-14507012 ]

ASF GitHub Bot commented on FLINK-1909:
---------------------------------------

Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/flink/pull/610#discussion_r28868831

    --- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java ---

    Okay, I see

> Refactor streaming scala api to use returns for adding typeinfo
> ---------------------------------------------------------------
>
>                 Key: FLINK-1909
>                 URL: https://issues.apache.org/jira/browse/FLINK-1909
>             Project: Flink
>          Issue Type: Improvement
>          Components: Scala API, Streaming
>            Reporter: Gyula Fora
>            Assignee: Gyula Fora
>
> Currently the streaming scala api uses transform to pass the extracted type information instead of .returns. This leads to a lot of code duplication.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507361#comment-14507361 ]

Márton Balassi commented on FLINK-1865:
---------------------------------------

Go ahead.

> Unstable test KafkaITCase
> -------------------------
>
>                 Key: FLINK-1865
>                 URL: https://issues.apache.org/jira/browse/FLINK-1865
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming, Tests
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Márton Balassi

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507359#comment-14507359 ] Márton Balassi commented on FLINK-1865:
---

Go ahead.

Unstable test KafkaITCase
-------------------------

                Key: FLINK-1865
                URL: https://issues.apache.org/jira/browse/FLINK-1865
            Project: Flink
         Issue Type: Bug
         Components: Streaming, Tests
   Affects Versions: 0.9
           Reporter: Stephan Ewen
           Assignee: Márton Balassi

{code}
Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
04/10/2015 13:46:53 Job execution switched to status RUNNING.
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to SCHEDULED
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to DEPLOYING
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to SCHEDULED
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to DEPLOYING
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to RUNNING
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to RUNNING
04/10/2015 13:47:04 Custom Source -> Stream Sink(1/1) switched to FAILED
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
	at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
	at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
	at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
	at org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54)
	at org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39)
	at org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196)
	at org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42)
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
	... 4 more
Caused by: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166)
	at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141)
	at org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41)
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
	... 9 more
04/10/2015 13:47:04 Job execution switched to status FAILING.
04/10/2015 13:47:04 Custom Source -> Stream Sink(1/1) switched to CANCELING
04/10/2015 13:47:04 Custom Source -> Stream Sink(1/1) switched to CANCELED
04/10/2015 13:47:04 Job execution switched to status FAILED.
04/10/2015 13:47:05 Job execution switched to status RUNNING.
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to SCHEDULED
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to DEPLOYING
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to SCHEDULED
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to DEPLOYING
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to RUNNING
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to RUNNING
04/10/2015 13:47:15 Custom Source -> Stream Sink(1/1) switched to FAILED
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
	at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
	at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
	at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
{code}
[jira] [Created] (FLINK-1932) Add L-BFGS to the optimization framework
Till Rohrmann created FLINK-1932: Summary: Add L-BFGS to the optimization framework Key: FLINK-1932 URL: https://issues.apache.org/jira/browse/FLINK-1932 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann A good candidate to add to the new optimization framework could be L-BFGS [1, 2]. Resources: [1] http://papers.nips.cc/paper/5333-large-scale-l-bfgs-using-mapreduce.pdf [2] http://en.wikipedia.org/wiki/Limited-memory_BFGS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
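For readers unfamiliar with the proposal above, the core of L-BFGS is the two-loop recursion, which multiplies the gradient by an approximate inverse Hessian built from the last m update pairs (s, y) without materializing any matrix. The following is a minimal, self-contained sketch on a toy quadratic — the class, method names, and the backtracking line search are illustrative only, not part of Flink or its eventual optimization framework:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal L-BFGS sketch: the two-loop recursion computes H_k * g from the
// last m curvature pairs (s_i, y_i) without forming H_k explicitly.
public class LbfgsSketch {

    // Toy objective: f(x) = 0.5 * (x0^2 + 10 * x1^2), a badly scaled quadratic.
    static double f(double[] x) { return 0.5 * (x[0] * x[0] + 10 * x[1] * x[1]); }
    static double[] grad(double[] x) { return new double[]{ x[0], 10 * x[1] }; }

    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    // Two-loop recursion: returns an approximation of H * g.
    // Pairs are stored oldest-first in ss/ys.
    static double[] twoLoop(double[] g, List<double[]> ss, List<double[]> ys) {
        int m = ss.size(), n = g.length;
        double[] q = g.clone();
        double[] alpha = new double[m];
        for (int i = m - 1; i >= 0; i--) {              // newest to oldest
            double rho = 1.0 / dot(ys.get(i), ss.get(i));
            alpha[i] = rho * dot(ss.get(i), q);
            for (int j = 0; j < n; j++) q[j] -= alpha[i] * ys.get(i)[j];
        }
        double gamma = 1.0;                              // initial H0 = gamma * I
        if (m > 0) {
            double[] s = ss.get(m - 1), y = ys.get(m - 1);
            gamma = dot(s, y) / dot(y, y);
        }
        for (int j = 0; j < n; j++) q[j] *= gamma;
        for (int i = 0; i < m; i++) {                    // oldest to newest
            double rho = 1.0 / dot(ys.get(i), ss.get(i));
            double beta = rho * dot(ys.get(i), q);
            for (int j = 0; j < n; j++) q[j] += (alpha[i] - beta) * ss.get(i)[j];
        }
        return q;
    }

    static double[] minimize(double[] x, int memory, int iters) {
        List<double[]> ss = new ArrayList<>(), ys = new ArrayList<>();
        double[] g = grad(x);
        for (int k = 0; k < iters; k++) {
            double[] hg = twoLoop(g, ss, ys);
            double[] d = new double[x.length];
            for (int j = 0; j < x.length; j++) d[j] = -hg[j];
            double t = 1.0, fx = f(x);                   // simple backtracking search
            double[] xNew;
            while (true) {
                xNew = x.clone();
                for (int j = 0; j < x.length; j++) xNew[j] += t * d[j];
                if (f(xNew) < fx || t < 1e-12) break;
                t *= 0.5;
            }
            double[] gNew = grad(xNew);
            double[] s = new double[x.length], y = new double[x.length];
            for (int j = 0; j < x.length; j++) { s[j] = xNew[j] - x[j]; y[j] = gNew[j] - g[j]; }
            if (dot(s, y) > 1e-12) {                     // keep only curvature-positive pairs
                ss.add(s); ys.add(y);
                if (ss.size() > memory) { ss.remove(0); ys.remove(0); }
            }
            x = xNew; g = gNew;
        }
        return x;
    }

    public static void main(String[] args) {
        double[] x = minimize(new double[]{1.0, 1.0}, 5, 50);
        System.out.println(x[0] + " " + x[1]);           // both close to 0
    }
}
```

A distributed version (as in reference [1]) mainly changes where the dot products and vector updates run; the recursion itself stays the same.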
[jira] [Commented] (FLINK-1422) Missing usage example for withParameters
[ https://issues.apache.org/jira/browse/FLINK-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507190#comment-14507190 ] ASF GitHub Bot commented on FLINK-1422:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/350

Missing usage example for withParameters
----------------------------------------

                Key: FLINK-1422
                URL: https://issues.apache.org/jira/browse/FLINK-1422
            Project: Flink
         Issue Type: Improvement
         Components: Documentation
   Affects Versions: 0.8.0
           Reporter: Alexander Alexandrov
           Priority: Trivial
            Fix For: 0.8.2
  Original Estimate: 1h
 Remaining Estimate: 1h

I am struggling to find a usage example of the withParameters method in the documentation. At the moment I only see this note:

{quote}
Note: As the content of broadcast variables is kept in-memory on each node, it should not become too large. For simpler things like scalar values you can simply make parameters part of the closure of a function, or use the withParameters(...) method to pass in a configuration.
{quote}
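The ticket is about a missing usage example, so here is a self-contained sketch of the mechanism behind withParameters: the parameters object handed to the operator is delivered to the rich function's open() method before any records are processed. Note that `Config`, `RichFilter`, and `applyWithParameters` are stand-ins invented for this sketch, not Flink classes — in Flink itself you would use `org.apache.flink.configuration.Configuration`, a `RichFilterFunction`, and `.withParameters(config)` on the operator:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Self-contained stand-ins (NOT Flink classes) mimicking how
// withParameters(...) hands a Configuration to a rich function's open().
public class WithParametersSketch {

    // Stand-in for org.apache.flink.configuration.Configuration.
    static class Config {
        private final Map<String, Integer> ints = new HashMap<>();
        void setInteger(String key, int value) { ints.put(key, value); }
        int getInteger(String key, int def) { return ints.getOrDefault(key, def); }
    }

    // Stand-in for a RichFilterFunction: open() receives the Configuration.
    abstract static class RichFilter<T> {
        public void open(Config parameters) {}
        public abstract boolean filter(T value);
    }

    // Stand-in for operator.withParameters(config): the runtime calls
    // open(config) once before the first filter() invocation.
    static <T> List<T> applyWithParameters(List<T> data, RichFilter<T> fn, Config config) {
        fn.open(config);
        List<T> out = new ArrayList<>();
        for (T v : data) if (fn.filter(v)) out.add(v);
        return out;
    }

    public static void main(String[] args) {
        Config config = new Config();
        config.setInteger("limit", 2);   // the scalar parameter to pass along

        List<Integer> result = applyWithParameters(
            Arrays.asList(1, 2, 3, 4),
            new RichFilter<Integer>() {
                private int limit;
                @Override public void open(Config parameters) {
                    // the value set above arrives here, not via the closure
                    limit = parameters.getInteger("limit", 0);
                }
                @Override public boolean filter(Integer value) {
                    return value > limit;
                }
            }, config);

        System.out.println(result);      // prints [3, 4]
    }
}
```

The point of the pattern, as the quoted note says, is to pass small scalar configuration values without capturing them in the function's closure.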
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507189#comment-14507189 ] Vasia Kalavri commented on FLINK-1930:
---

Hi [~rmetzger]! No, I'm not sure what the root cause is. I bumped into this when running ~1 month old code (which used to run fine), so it must be something recently introduced.

NullPointerException in vertex-centric iteration
------------------------------------------------

                Key: FLINK-1930
                URL: https://issues.apache.org/jira/browse/FLINK-1930
            Project: Flink
         Issue Type: Bug
   Affects Versions: 0.9
           Reporter: Vasia Kalavri

Hello to my Squirrels,
I came across this exception when having a vertex-centric iteration output followed by a group-by. I'm not sure what is causing it, since I saw this error in a rather large pipeline, but I managed to reproduce it with [this code example|https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309] and a sufficiently large dataset, e.g. [this one|http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally). It seems like a null Buffer in the RecordWriter. The exception message is the following:

Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
	at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
	at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
	at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
	at java.lang.Thread.run(Thread.java:745)
[GitHub] flink pull request: [FLINK-1422] Add withParameters() to documenta...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/350 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Periodic full stream aggregates added + partit...
Github user gyfora commented on the pull request:

    https://github.com/apache/flink/pull/614#issuecomment-95157387

The current stream reduces only keep 1 element as their state, the problem is that if you run it in parallel, each node will keep a partial reduce.
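To illustrate the point in the comment above: a reduce keeps a single element of state per parallel instance, so each instance only ever holds a *partial* aggregate, and the global value exists only after the partials are combined. A self-contained sketch (names are illustrative, not Flink API):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Each "parallel instance" of a reduce holds one element of state,
// which is only a partial aggregate of the records it has seen.
public class PartialReduceSketch {

    static int reducePartition(List<Integer> partition, BinaryOperator<Integer> op) {
        int state = partition.get(0);            // the single element of state
        for (int i = 1; i < partition.size(); i++) {
            state = op.apply(state, partition.get(i));
        }
        return state;
    }

    public static void main(String[] args) {
        BinaryOperator<Integer> sum = Integer::sum;
        // Two parallel instances, each seeing only its partition of the stream.
        int partial1 = reducePartition(Arrays.asList(1, 2, 3), sum);  // 6
        int partial2 = reducePartition(Arrays.asList(4, 5), sum);     // 9
        // Neither partial is the global aggregate on its own; a final
        // combine step is needed to recover it.
        int global = sum.apply(partial1, partial2);                   // 15
        System.out.println(partial1 + " " + partial2 + " " + global);
    }
}
```

This is exactly why a parallel full-stream aggregate needs an extra merge step after the per-instance reduces.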
[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo
[ https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507002#comment-14507002 ] ASF GitHub Bot commented on FLINK-1909:
---

Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/flink/pull/610#discussion_r28868234

--- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
	 * @return the data stream constructed
	 */
	@SuppressWarnings("unchecked")
-	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-			TypeInformation<OUT> outTypeInfo, String sourceName) {
+	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+		TypeInformation<OUT> outTypeInfo;

-		if (outTypeInfo == null) {
-			if (function instanceof GenericSourceFunction) {
-				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-			} else {
+		if (function instanceof GenericSourceFunction) {
+			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+		} else {
+			try {
				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class, function.getClass(), 0, null, null);
+			} catch (InvalidTypesException e) {
+				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

Why did you change this? I suspect the exception will now be thrown somewhere else?

Refactor streaming scala api to use returns for adding typeinfo
---------------------------------------------------------------

                Key: FLINK-1909
                URL: https://issues.apache.org/jira/browse/FLINK-1909
            Project: Flink
         Issue Type: Improvement
         Components: Scala API, Streaming
           Reporter: Gyula Fora
           Assignee: Gyula Fora

Currently the streaming scala api uses transform to pass the extracted type information instead of .returns. This leads to a lot of code duplication.
[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...
Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/flink/pull/610#discussion_r28868831

--- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
	 * @return the data stream constructed
	 */
	@SuppressWarnings("unchecked")
-	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-			TypeInformation<OUT> outTypeInfo, String sourceName) {
+	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+		TypeInformation<OUT> outTypeInfo;

-		if (outTypeInfo == null) {
-			if (function instanceof GenericSourceFunction) {
-				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-			} else {
+		if (function instanceof GenericSourceFunction) {
+			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+		} else {
+			try {
				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class, function.getClass(), 0, null, null);
+			} catch (InvalidTypesException e) {
+				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

Okay, I see
[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/617#issuecomment-95173776

No, I would not add a dependency there. Maybe a comment that we've managed a transitive dependency of hbase.
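For context, the Maven pattern being discussed looks like the following sketch (illustrative only — this is not Flink's actual root POM, and the version number shown is an assumption): the parent pins the version once in `dependencyManagement`, so modules such as flink-runtime can declare the dependency without a version.

```xml
<!-- root pom.xml: pins the version once for all modules (illustrative) -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.javassist</groupId>
      <artifactId>javassist</artifactId>
      <version>3.18.2-GA</version> <!-- hypothetical version -->
    </dependency>
  </dependencies>
</dependencyManagement>
```

```xml
<!-- child module (e.g. flink-runtime/pom.xml): no version element needed -->
<dependencies>
  <dependency>
    <groupId>org.javassist</groupId>
    <artifactId>javassist</artifactId>
  </dependency>
</dependencies>
```

The same mechanism also overrides the version of a transitive dependency (here, one pulled in via hbase) without every module repeating it.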
[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...
Github user gyfora closed the pull request at:

    https://github.com/apache/flink/pull/610
[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo
[ https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507131#comment-14507131 ] ASF GitHub Bot commented on FLINK-1909:
---

Github user gyfora closed the pull request at:

    https://github.com/apache/flink/pull/610
[GitHub] flink pull request: [FLINK-1422] Add withParameters() to documenta...
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/350#issuecomment-95206967

Thanks, Chesnay! I've added the constructor parametrization, fixed the code examples, and rebased on the latest master. Going to merge it.
[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...
GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/617

[runtime] Bump Netty version to 4.27.Final and add javassist

@rmetzger, javassist is set in the root POM. Is it OK to leave it in flink-runtime as I have it now w/o version?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/incubator-flink javassist

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/617.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #617

commit 6a91d722005d7f8d138e8553f92a134c52209de8
Author: Ufuk Celebi u...@apache.org
Date:   2015-04-22T12:50:38Z

    [runtime] Bump Netty version to 4.27.Final and add javassist
[GitHub] flink pull request: [FLINK-1875] Add figure explaining slots and p...
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/604
[jira] [Commented] (FLINK-1875) Add figure to documentation describing slots and parallelism
[ https://issues.apache.org/jira/browse/FLINK-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507080#comment-14507080 ] ASF GitHub Bot commented on FLINK-1875:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/604

Add figure to documentation describing slots and parallelism
------------------------------------------------------------

                Key: FLINK-1875
                URL: https://issues.apache.org/jira/browse/FLINK-1875
            Project: Flink
         Issue Type: Improvement
         Components: Documentation
   Affects Versions: 0.9
           Reporter: Robert Metzger
           Assignee: Robert Metzger

Our users are still confused how parallelism and slots are connected to each other. We tried addressing this issue already with FLINK-1679, but I think we also need to have a nice picture in our documentation. This is too complicated: http://ci.apache.org/projects/flink/flink-docs-master/internal_job_scheduling.html
[GitHub] flink pull request: [docs] Change doc layout
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/606
[GitHub] flink pull request: [FLINK-1687] [streaming] Syncing streaming sou...
Github user gyfora commented on the pull request:

    https://github.com/apache/flink/pull/521#issuecomment-95199623

Hey Peti,

I added a refactor commit that reworked the way types are handled in the sources a little bit. https://github.com/apache/flink/commit/6df1dd2cc848d0a691a98309a3bb760f9a998673

Can you please rebase on that one? It should make things nicer.

Gyula
[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API
[ https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507148#comment-14507148 ] ASF GitHub Bot commented on FLINK-1687:
---

Github user gyfora commented on the pull request:

    https://github.com/apache/flink/pull/521#issuecomment-95199623

Hey Peti,

I added a refactor commit that reworked the way types are handled in the sources a little bit. https://github.com/apache/flink/commit/6df1dd2cc848d0a691a98309a3bb760f9a998673

Can you please rebase on that one? It should make things nicer.

Gyula

Streaming file source/sink API is not in sync with the batch API
----------------------------------------------------------------

                Key: FLINK-1687
                URL: https://issues.apache.org/jira/browse/FLINK-1687
            Project: Flink
         Issue Type: Improvement
         Components: Streaming
           Reporter: Gábor Hermann
           Assignee: Péter Szabó

Streaming environment is missing file inputs like readFile, readCsvFile and also the more general createInput function, and outputs like writeAsCsv and write. Streaming and batch API should be consistent.
[GitHub] flink pull request: Fixed Configurable HadoopOutputFormat (FLINK-1...
Github user fpompermaier commented on a diff in the pull request:

    https://github.com/apache/flink/pull/571#discussion_r28869610

--- Diff: flink-staging/flink-hbase/pom.xml ---
@@ -112,6 +112,12 @@ under the License.
 				</exclusion>
 			</exclusions>
 		</dependency>
+		<dependency>
--- End diff --

Unfortunately the TableInputFormat and TableOutputFormat are in the server jar. For the read we've reimplemented it to make it more robust so we don't need that jar, but for the output it is indeed required.
[jira] [Commented] (FLINK-1828) Impossible to output data to an HBase table
[ https://issues.apache.org/jira/browse/FLINK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507025#comment-14507025 ] ASF GitHub Bot commented on FLINK-1828:
---

Github user fpompermaier commented on a diff in the pull request:

    https://github.com/apache/flink/pull/571#discussion_r28869610

--- Diff: flink-staging/flink-hbase/pom.xml ---
@@ -112,6 +112,12 @@ under the License.
 				</exclusion>
 			</exclusions>
 		</dependency>
+		<dependency>
--- End diff --

Unfortunately the TableInputFormat and TableOutputFormat are in the server jar. For the read we've reimplemented it to make it more robust so we don't need that jar, but for the output it is indeed required.

Impossible to output data to an HBase table
-------------------------------------------

                Key: FLINK-1828
                URL: https://issues.apache.org/jira/browse/FLINK-1828
            Project: Flink
         Issue Type: Bug
         Components: Hadoop Compatibility
   Affects Versions: 0.9
           Reporter: Flavio Pompermaier
             Labels: hadoop, hbase
            Fix For: 0.9

Right now it is not possible to use HBase TableOutputFormat as output format because Configurable.setConf is not called in the configure() method of the HadoopOutputFormatBase
[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...
Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/flink/pull/610#discussion_r28868234

--- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
	 * @return the data stream constructed
	 */
	@SuppressWarnings("unchecked")
-	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-			TypeInformation<OUT> outTypeInfo, String sourceName) {
+	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+		TypeInformation<OUT> outTypeInfo;

-		if (outTypeInfo == null) {
-			if (function instanceof GenericSourceFunction) {
-				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-			} else {
+		if (function instanceof GenericSourceFunction) {
+			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+		} else {
+			try {
				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class, function.getClass(), 0, null, null);
+			} catch (InvalidTypesException e) {
+				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

Why did you change this? I suspect the exception will now be thrown somewhere else?
[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/617#issuecomment-95163295

Ah .. we didn't change the javassist version ;)
[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/617#issuecomment-95169398

Yap. I've just seen that there is no reference to javassist in the flink-hbase pom. Is that intentional? Shouldn't we also add a dependency there w/o the version?
[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...
Github user gyfora commented on a diff in the pull request:

    https://github.com/apache/flink/pull/610#discussion_r28868351

--- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
	 * @return the data stream constructed
	 */
	@SuppressWarnings("unchecked")
-	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-			TypeInformation<OUT> outTypeInfo, String sourceName) {
+	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+		TypeInformation<OUT> outTypeInfo;

-		if (outTypeInfo == null) {
-			if (function instanceof GenericSourceFunction) {
-				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-			} else {
+		if (function instanceof GenericSourceFunction) {
+			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+		} else {
+			try {
				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class, function.getClass(), 0, null, null);
+			} catch (InvalidTypesException e) {
+				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

This is the same mechanism that all flink operators use to allow missing types that can be filled with .returns(..) afterwards.
[jira] [Created] (FLINK-1929) Add code to cleanly stop a running streaming topology
Robert Metzger created FLINK-1929: - Summary: Add code to cleanly stop a running streaming topology Key: FLINK-1929 URL: https://issues.apache.org/jira/browse/FLINK-1929 Project: Flink Issue Type: Improvement Components: Streaming Affects Versions: 0.9 Reporter: Robert Metzger Right now its not possible to cleanly stop a running Streaming topology. Cancelling the job will cancel all operators, but for proper exactly once processing from Kafka sources, we need to provide a way to stop the sources first, wait until all remaining tuples have been processed and then shut down the sources (so that they can commit the right offset to Zookeeper). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
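The shutdown sequence described in the ticket — stop the sources first, let the remaining records drain, then commit offsets — can be sketched with a plain volatile stop flag. All names here are illustrative stand-ins (this is not Flink's eventual stop() API, and the "offset" is a stand-in for a Kafka offset committed to ZooKeeper):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of "stop the source first, then drain": once the source thread has
// stopped, its offset reflects exactly what was emitted and is safe to commit.
public class StoppableSourceSketch {

    static class Source implements Runnable {
        private volatile boolean running = true;  // flipped by stop()
        private final BlockingQueue<Integer> out;
        private int offset = 0;                   // stand-in for a Kafka offset

        Source(BlockingQueue<Integer> out) { this.out = out; }

        public void stop() { running = false; }
        public int committedOffset() { return offset; }

        @Override public void run() {
            while (running && offset < 1000) {
                out.add(offset);                  // emit the next record
                offset++;                         // advance only after emitting
            }
            // After the loop exits, offset == number of emitted records,
            // so committing it (e.g. to ZooKeeper) is exactly-once safe.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> channel = new LinkedBlockingQueue<>();
        Source source = new Source(channel);
        Thread t = new Thread(source);
        t.start();
        Thread.sleep(10);
        source.stop();     // step 1: stop the source cleanly
        t.join();          // step 2: wait until it has finished emitting
        // step 3: downstream operators can now drain `channel`; the offset
        // to commit matches the emitted records exactly
        System.out.println(source.committedOffset() == channel.size());
    }
}
```

Contrast this with cancellation, which interrupts the source mid-record and leaves the committed offset out of sync with what was actually processed.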
[GitHub] flink pull request: [docs] Change doc layout
Github user ktzoumas commented on the pull request: https://github.com/apache/flink/pull/606#issuecomment-95174641 Good to merge
[jira] [Resolved] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo
[ https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora resolved FLINK-1909. --- Resolution: Fixed https://github.com/apache/flink/commit/6df1dd2cc848d0a691a98309a3bb760f9a998673 Refactor streaming scala api to use returns for adding typeinfo --- Key: FLINK-1909 URL: https://issues.apache.org/jira/browse/FLINK-1909 Project: Flink Issue Type: Improvement Components: Scala API, Streaming Reporter: Gyula Fora Assignee: Gyula Fora Currently the streaming scala api uses transform to pass the extracted type information instead of .returns. This leads to a lot of code duplication.
[jira] [Commented] (FLINK-1422) Missing usage example for withParameters
[ https://issues.apache.org/jira/browse/FLINK-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507163#comment-14507163 ] ASF GitHub Bot commented on FLINK-1422: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/350#issuecomment-95206967 Thanks, Chesnay! I've added the constructor parametrization, fixed the code examples, and rebased on the latest master. Going to merge it. Missing usage example for withParameters -- Key: FLINK-1422 URL: https://issues.apache.org/jira/browse/FLINK-1422 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 0.8.0 Reporter: Alexander Alexandrov Priority: Trivial Fix For: 0.8.2 Original Estimate: 1h Remaining Estimate: 1h I am struggling to find a usage example of the withParameters method in the documentation. At the moment I only see this note: {quote} Note: As the content of broadcast variables is kept in-memory on each node, it should not become too large. For simpler things like scalar values you can simply make parameters part of the closure of a function, or use the withParameters(...) method to pass in a configuration. {quote}
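The withParameters pattern the issue asks an example for can be illustrated with a self-contained sketch. The Configuration, RichFilter, and FilterOperator classes below are minimal hypothetical stand-ins for Flink's Configuration, rich functions, and operators; the point is only the shape of the API: the configuration attached via withParameters(...) reaches the function's open() before any elements are processed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged, self-contained sketch of the withParameters pattern; these are
// stand-in classes, not Flink's actual Configuration/RichFunction/operator API.

class Configuration {
    private final Map<String, String> entries = new HashMap<>();
    Configuration set(String key, String value) { entries.put(key, value); return this; }
    String getString(String key, String defaultValue) { return entries.getOrDefault(key, defaultValue); }
}

abstract class RichFilter<T> {
    void open(Configuration parameters) {}   // called once, before any filter(..) call
    abstract boolean filter(T value);
}

class MinLengthFilter extends RichFilter<String> {
    private int min;
    @Override
    void open(Configuration parameters) {
        min = Integer.parseInt(parameters.getString("minLength", "0"));
    }
    @Override
    boolean filter(String value) { return value.length() >= min; }
}

class FilterOperator<T> {
    private final RichFilter<T> fn;
    private Configuration parameters = new Configuration();
    FilterOperator(RichFilter<T> fn) { this.fn = fn; }
    FilterOperator<T> withParameters(Configuration c) { this.parameters = c; return this; }
    List<T> run(List<T> data) {
        fn.open(parameters);                 // the configuration is handed over here
        List<T> out = new ArrayList<>();
        for (T v : data) if (fn.filter(v)) out.add(v);
        return out;
    }
}
```

Usage mirrors the documented intent: build a Configuration, attach it with withParameters, and read it back in open().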
[jira] [Created] (FLINK-1930) NullPointerException in vertex-centric iteration
Vasia Kalavri created FLINK-1930: Summary: NullPointerException in vertex-centric iteration Key: FLINK-1930 URL: https://issues.apache.org/jira/browse/FLINK-1930 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Vasia Kalavri Hello to my Squirrels, I came across this exception when having a vertex-centric iteration output followed by a group by. I'm not sure what is causing it, since I saw this error in a rather large pipeline, but I managed to reproduce it with [this code example | https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309] and a sufficiently large dataset, e.g. [this one | http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally). It seems like a null Buffer in RecordWriter. The exception message is the following:

Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
	at org.apache.flink.runtime.jobmanager.JobManager$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:37)
	at org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:30)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
	at org.apache.flink.runtime.ActorLogMessages$anon$1.applyOrElse(ActorLogMessages.scala:30)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
	at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
	at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
	at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
	at java.lang.Thread.run(Thread.java:745)
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507179#comment-14507179 ] Robert Metzger commented on FLINK-1930: --- I've seen this issue also recently in a streaming job. Are you sure that this issue is the root cause? Or has the job errored before?
[jira] [Commented] (FLINK-1907) Scala Interactive Shell
[ https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507306#comment-14507306 ] Nikolaas Steenbergen commented on FLINK-1907: - I've added a class to start up the shell from within Flink (org.apache.flink.api.scala.FlinkShell), so it can be started from the Flink jar directly. Simply writing out the compiled lines of the wordcount example, putting them in a jar (jar cf output.jar inputDir) and uploading the jar to a local cluster (sh start-local.sh) via the webclient leads to: org.apache.flink.client.program.ProgramInvocationException: Neither a 'Main-Class', nor a 'program-class' entry was found in the jar file. [..] Do you have a suggestion for the easiest way to put the individually compiled shell commands into an executable format? Scala Interactive Shell --- Key: FLINK-1907 URL: https://issues.apache.org/jira/browse/FLINK-1907 Project: Flink Issue Type: New Feature Components: Scala API Reporter: Nikolaas Steenbergen Assignee: Nikolaas Steenbergen Priority: Minor Build an interactive shell for the Scala API.
[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/617#issuecomment-95218150 OK, then this is good to merge?
[GitHub] flink pull request: Improve error messages on Task deployment
Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/615#issuecomment-95223394 +1 LGTM
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507290#comment-14507290 ] Stephan Ewen commented on FLINK-1930: - I have a bit of code pending that may help to figure out whether this is a side-effect of cancelling, or a bug in the buffer pools. I hope I can commit that soon...
[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507556#comment-14507556 ] Chiwan Park commented on FLINK-1745: Hi. About the model, I think that grouping the neighbours with respect to each of the queried points is better. The following code could serve as an example. {code} val result: DataSet[Tuple2[Vector, Array[Vector]]] = model.transform(testingDS) {code} It's a great idea to give the user the choice to select the algorithm that suits them best. I think it would be better to split this issue into two or more issues (exact kNN (BRJ), approximate kNN (zkNN, kNN based on hybrid spill trees), distance measures). I hope to implement the exact kNN and the distance measures first, and then try the approximate kNN. Add k-nearest-neighbours algorithm to machine learning library -- Key: FLINK-1745 URL: https://issues.apache.org/jira/browse/FLINK-1745 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Chiwan Park Labels: ML, Starter Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial, it is still used as a means to classify data and to do regression. Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]
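The output shape proposed above (each queried point paired with its k neighbours) can be sketched with an exact brute-force search. This is a hedged stand-in using plain double[] vectors and in-memory collections rather than flink-ml's Vector and DataSet types; it only illustrates the grouping, not the eventual distributed implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch: exact (brute-force) kNN that groups the k nearest training
// vectors per query point, mirroring DataSet[Tuple2[Vector, Array[Vector]]].
class Knn {
    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; sum += d * d; }
        return Math.sqrt(sum);
    }

    static Map<double[], double[][]> knn(List<double[]> training, List<double[]> queries, int k) {
        Map<double[], double[][]> result = new LinkedHashMap<>();
        for (double[] q : queries) {
            List<double[]> sorted = new ArrayList<>(training);
            sorted.sort(Comparator.comparingDouble((double[] t) -> euclidean(t, q)));
            result.put(q, sorted.subList(0, Math.min(k, sorted.size())).toArray(new double[0][]));
        }
        return result;
    }
}
```

Sorting all of the training data per query is O(n log n) per query point; the approximate variants discussed in the thread exist precisely to avoid this cost at scale.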
[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API
[ https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507675#comment-14507675 ] ASF GitHub Bot commented on FLINK-1687: --- Github user szape commented on the pull request: https://github.com/apache/flink/pull/521#issuecomment-95303397 @gyfora Of course, no problem. It will make the source API cleaner. Peter Streaming file source/sink API is not in sync with the batch API Key: FLINK-1687 URL: https://issues.apache.org/jira/browse/FLINK-1687 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gábor Hermann Assignee: Péter Szabó Streaming environment is missing file inputs like readFile, readCsvFile and also the more general createInput function, and outputs like writeAsCsv and write. Streaming and batch API should be consistent.
[jira] [Closed] (FLINK-1653) Setting up Apache Jenkins CI for continuous tests
[ https://issues.apache.org/jira/browse/FLINK-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Saputra closed FLINK-1653. Resolution: Fixed With the increased Travis capacity for the ASF, let's close this one for now. Setting up Apache Jenkins CI for continuous tests - Key: FLINK-1653 URL: https://issues.apache.org/jira/browse/FLINK-1653 Project: Flink Issue Type: Task Components: Build System Reporter: Henry Saputra Assignee: Henry Saputra Priority: Minor We already have a Travis build for the Apache Flink GitHub mirror. This task is used to track the effort to set up a Flink Jenkins CI in the ASF environment [1] [1] https://builds.apache.org
[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508420#comment-14508420 ] Chiwan Park commented on FLINK-1745: Okay, thanks [~raghav.chalapa...@gmail.com] for the concession. I will create an issue for the distance measures and implement both the distance measures and the exact kNN as soon as possible. :)
[GitHub] flink pull request: Periodic full stream aggregates added + partit...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/614
[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508143#comment-14508143 ] Raghav Chalapathy commented on FLINK-1745: -- Hi, I agree with the idea to provide users the choice. Chiwan, since you are hoping to take up the exact kNN implementation, shall I go ahead and work on other issues, wait for your exact implementation, and add the approximate kNN later on? I feel we need to provide the user with the results of all the best candidate algorithms and compare them, as H2O's deep learning does, so that the user can see the performance of the various algorithms picked. with regards Raghav
[GitHub] flink pull request: Add DEPRECATED annotations in Spargel APIs and...
GitHub user hsaputra opened a pull request: https://github.com/apache/flink/pull/618 Add DEPRECATED annotations in Spargel APIs and update the doc to refer to Gelly for Flink graph processing. Mark all user-facing classes of the Spargel API as deprecated and add a comment in the docs pointing people to Gelly. This should help migration to Gelly in the next release. You can merge this pull request into a Git repository by running: $ git pull https://github.com/hsaputra/flink FLINK-1693_deprecate_spargel_apis Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/618.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #618 commit 6e978948c86d58fa773af728fbd5e497b82b Author: hsaputra hsapu...@apache.org Date: 2015-04-22T21:10:29Z Add DEPRECATED annotations in Spargel APIs and update the doc to refer to Gelly for Flink graph processing.
[GitHub] flink pull request: Add DEPRECATED annotations in Spargel APIs and...
Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/618#issuecomment-95337327 CC @vasia
[jira] [Created] (FLINK-1933) Add distance measure interface and basic implementation to machine learning library
Chiwan Park created FLINK-1933: -- Summary: Add distance measure interface and basic implementation to machine learning library Key: FLINK-1933 URL: https://issues.apache.org/jira/browse/FLINK-1933 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Chiwan Park Assignee: Chiwan Park Add a distance measure interface to calculate the distance between two vectors, and some implementations of the interface. In FLINK-1745, [~till.rohrmann] suggests the following interface: {code} trait DistanceMeasure { def distance(a: Vector, b: Vector): Double } {code} I think the following list of implementations is sufficient as a first offering to ML library users. * Manhattan distance [1] * Cosine distance [2] * Euclidean distance (and squared Euclidean) [3] * Tanimoto distance [4] * Minkowski distance [5] * Chebyshev distance [6] [1]: http://en.wikipedia.org/wiki/Taxicab_geometry [2]: http://en.wikipedia.org/wiki/Cosine_similarity [3]: http://en.wikipedia.org/wiki/Euclidean_distance [4]: http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29 [5]: http://en.wikipedia.org/wiki/Minkowski_distance [6]: http://en.wikipedia.org/wiki/Chebyshev_distance
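The proposed Scala trait translates directly into a small interface. Below is a hedged Java sketch of three of the listed measures over plain double[] vectors; flink-ml's own Vector type and class names may differ, so treat this only as a reference for the arithmetic.

```java
// Hedged sketch of the DistanceMeasure idea from the issue; names and the use
// of double[] instead of an ML Vector type are illustrative assumptions.

interface DistanceMeasure {
    double distance(double[] a, double[] b);
}

// sqrt of the sum of squared coordinate differences
class EuclideanDistance implements DistanceMeasure {
    public double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; sum += d * d; }
        return Math.sqrt(sum);
    }
}

// sum of absolute coordinate differences (taxicab geometry)
class ManhattanDistance implements DistanceMeasure {
    public double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
        return sum;
    }
}

// maximum absolute coordinate difference
class ChebyshevDistance implements DistanceMeasure {
    public double distance(double[] a, double[] b) {
        double max = 0;
        for (int i = 0; i < a.length; i++) max = Math.max(max, Math.abs(a[i] - b[i]));
        return max;
    }
}
```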
[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/617#issuecomment-95162838 Are we sure that HBase is still working? I think there are no automated tests for HBase ;)
[jira] [Resolved] (FLINK-1875) Add figure to documentation describing slots and parallelism
[ https://issues.apache.org/jira/browse/FLINK-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger resolved FLINK-1875. --- Resolution: Fixed Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/db608332 Add figure to documentation describing slots and parallelism Key: FLINK-1875 URL: https://issues.apache.org/jira/browse/FLINK-1875 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Our users are still confused how parallelism and slots are connected to each other. We tried addressing this issue already with FLINK-1679, but I think we also need to have a nice picture in our documentation. This is too complicated: http://ci.apache.org/projects/flink/flink-docs-master/internal_job_scheduling.html
[jira] [Created] (FLINK-1923) Replace asynchronous logging in JobManager with regular slf4j logging
Robert Metzger created FLINK-1923: - Summary: Replace asynchronous logging in JobManager with regular slf4j logging Key: FLINK-1923 URL: https://issues.apache.org/jira/browse/FLINK-1923 Project: Flink Issue Type: Task Components: JobManager Affects Versions: 0.9 Reporter: Robert Metzger It's hard to understand exactly what's going on in the JobManager because the log messages are sent asynchronously by a logging actor.
[jira] [Created] (FLINK-1925) Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD
Till Rohrmann created FLINK-1925: Summary: Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD Key: FLINK-1925 URL: https://issues.apache.org/jira/browse/FLINK-1925 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Assignee: Till Rohrmann ResearchGate reported that a job times out while submitting tasks to the TaskManager. The reason is that the JobManager expects a TaskOperationResult response upon submitting a task to the TM. The TM then downloads the required jars from the JM, which blocks the actor thread and can take a very long time if many TMs download from the JM. Due to this, the SubmitTask future throws a TimeoutException. A possible solution could be that the TM eagerly acknowledges the reception of the SubmitTask message and executes the task initialization within a future. The future will upon completion send an UpdateTaskExecutionState message to the JM, which switches the state of the task from deploying to running. This means that the handler of the SubmitTask future in {{Execution}} won't change the state of the task.
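The proposed two-phase protocol can be sketched concisely: acknowledge the submission immediately, run the potentially slow initialization on another thread, and report the DEPLOYING -> RUNNING transition asynchronously. All names below are hypothetical stand-ins for the Flink messages described in the issue, not the real actor code.

```java
import java.util.function.Consumer;

// Hedged sketch of "eager ack + async init"; TaskState and submitTask are
// illustrative names, not Flink's actual SubmitTask/UpdateTaskExecutionState.
enum TaskState { DEPLOYING, RUNNING }

class TaskManagerSketch {
    static String submitTask(Runnable initialize, Consumer<TaskState> updateState) {
        updateState.accept(TaskState.DEPLOYING);   // eager ack: the caller never waits on init
        new Thread(() -> {                         // slow work (e.g. jar download) off the actor thread
            initialize.run();
            updateState.accept(TaskState.RUNNING); // async state transition once init completes
        }).start();
        return "ack";
    }
}
```

Because the acknowledgement no longer waits on jar downloads, the SubmitTask future cannot time out for that reason; failures during initialization would instead be reported through the asynchronous state update.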
[GitHub] flink pull request: [FLINK-1924] Minor Refactoring
GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/616 [FLINK-1924] Minor Refactoring This PR resolves a few minor issues, including: formatting; simpler Python process initialization; renaming of the Python connection following the switch to TCP. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zentol/flink python_refactor2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/616.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #616 commit 81d4c197bab6de4fb45f19ba2ca06f1f042c1812 Author: zentol s.mo...@web.de Date: 2015-03-27T10:53:23Z [FLINK-1924] Minor Refactoring
[jira] [Commented] (FLINK-1924) [Py] Refactor a few minor things
[ https://issues.apache.org/jira/browse/FLINK-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506778#comment-14506778 ] ASF GitHub Bot commented on FLINK-1924: --- GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/616 [FLINK-1924] Minor Refactoring [Py] Refactor a few minor things Key: FLINK-1924 URL: https://issues.apache.org/jira/browse/FLINK-1924 Project: Flink Issue Type: Improvement Components: Python API Affects Versions: 0.8.1 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Priority: Trivial Fix For: 0.9
[jira] [Updated] (FLINK-1925) Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD
[ https://issues.apache.org/jira/browse/FLINK-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-1925: - Description: A user reported that a job times out while submitting tasks to the TaskManager. The reason is that the JobManager expects a TaskOperationResult response upon submitting a task to the TM. The TM downloads then the required jars from the JM which blocks the actor thread and can take a very long time if many TMs download from the JM. Due to this, the SubmitTask future throws a TimeOutException. A possible solution could be that the TM eagerly acknowledges the reception of the SubmitTask message and executes the task initialization within a future. The future will upon completion send a UpdateTaskExecutionState message to the JM which switches the state of the task from deploying to running. This means that the handler of SubmitTask future in {{Execution}} won't change the state of the task. was: ResearchGate reported that a job times out while submitting tasks to the TaskManager. The reason is that the JobManager expects a TaskOperationResult response upon submitting a task to the TM. The TM downloads then the required jars from the JM which blocks the actor thread and can take a very long time if many TMs download from the JM. Due to this, the SubmitTask future throws a TimeOutException. A possible solution could be that the TM eagerly acknowledges the reception of the SubmitTask message and executes the task initialization within a future. The future will upon completion send a UpdateTaskExecutionState message to the JM which switches the state of the task from deploying to running. This means that the handler of SubmitTask future in {{Execution}} won't change the state of the task. 
Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD Key: FLINK-1925 URL: https://issues.apache.org/jira/browse/FLINK-1925 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Assignee: Till Rohrmann A user reported that a job times out while submitting tasks to the TaskManager. The reason is that the JobManager expects a TaskOperationResult response upon submitting a task to the TM. The TM then downloads the required jars from the JM, which blocks the actor thread and can take a very long time if many TMs download from the JM. Due to this, the SubmitTask future throws a TimeOutException. A possible solution could be that the TM eagerly acknowledges the reception of the SubmitTask message and executes the task initialization within a future. The future will, upon completion, send an UpdateTaskExecutionState message to the JM, which switches the state of the task from deploying to running. This means that the handler of the SubmitTask future in {{Execution}} won't change the state of the task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
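The two-phase submission proposed above can be sketched as a small model: the acknowledgement is sent before the slow initialization runs, so the JobManager's SubmitTask future cannot time out waiting for jar downloads; the state switch to running (or failed) arrives later via a separate message. All names here (submit_task, RecordingJobManager) are illustrative, not Flink's actual API.

```python
from concurrent.futures import ThreadPoolExecutor

class RecordingJobManager:
    """In-memory stand-in for the JobManager actor; records messages in order."""
    def __init__(self):
        self.events = []

    def acknowledge_submission(self, task_id):
        self.events.append(("ACK", task_id))

    def update_task_execution_state(self, task_id, state):
        self.events.append((state, task_id))

def submit_task(task_id, slow_initialization, job_manager, executor):
    # Phase 1: eager acknowledgement, returned before any heavy work,
    # so the SubmitTask future on the JobManager side resolves immediately.
    job_manager.acknowledge_submission(task_id)

    def run():
        try:
            slow_initialization()  # e.g. jar download + task instantiation
            job_manager.update_task_execution_state(task_id, "RUNNING")
        except Exception:
            job_manager.update_task_execution_state(task_id, "FAILED")

    # Phase 2: initialization happens asynchronously in a future.
    return executor.submit(run)
```

With this split, a slow initialization only delays the DEPLOYING-to-RUNNING transition, never the acknowledgement itself.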
[jira] [Updated] (FLINK-1924) [Py] Refactor a few minor things
[ https://issues.apache.org/jira/browse/FLINK-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-1924: Affects Version/s: (was: 0.8.2) 0.9 Fix Version/s: (was: 0.8.2) 0.9 [Py] Refactor a few minor things Key: FLINK-1924 URL: https://issues.apache.org/jira/browse/FLINK-1924 Project: Flink Issue Type: Improvement Components: Python API Affects Versions: 0.9 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Priority: Trivial Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1924) [Py] Refactor a few minor things
[ https://issues.apache.org/jira/browse/FLINK-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-1924: Affects Version/s: (was: 0.8.1) 0.8.2 Fix Version/s: (was: 0.9) 0.8.2 [Py] Refactor a few minor things Key: FLINK-1924 URL: https://issues.apache.org/jira/browse/FLINK-1924 Project: Flink Issue Type: Improvement Components: Python API Affects Versions: 0.8.2 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Priority: Trivial Fix For: 0.8.2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1926) [Py] Sync Python API with other API's
[ https://issues.apache.org/jira/browse/FLINK-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-1926: Summary: [Py] Sync Python API with other API's (was: [Py] Sync Python API with otehr API's) [Py] Sync Python API with other API's - Key: FLINK-1926 URL: https://issues.apache.org/jira/browse/FLINK-1926 Project: Flink Issue Type: Improvement Components: Python API Affects Versions: 0.9 Reporter: Chesnay Schepler Meta issue to track overall sync status. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1926) [Py] Sync Python API with otehr API's
Chesnay Schepler created FLINK-1926: --- Summary: [Py] Sync Python API with otehr API's Key: FLINK-1926 URL: https://issues.apache.org/jira/browse/FLINK-1926 Project: Flink Issue Type: Improvement Components: Python API Affects Versions: 0.9 Reporter: Chesnay Schepler Meta issue to track overall sync status. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1927) [Py] Rework operator distribution
Chesnay Schepler created FLINK-1927: --- Summary: [Py] Rework operator distribution Key: FLINK-1927 URL: https://issues.apache.org/jira/browse/FLINK-1927 Project: Flink Issue Type: Improvement Components: Python API Affects Versions: 0.9 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Priority: Minor Fix For: 0.9 Currently, the python operator is created when executing the python plan file, serialized using dill and saved as a byte[] in the java function. It is then deserialized at runtime on each node. The current implementation is fairly hacky, and imposes certain limitations that make it hard to work with. Chaining, or generally saving other user-code, always requires a separate deserialization step after deserializing the operator. These issues can be easily circumvented by rebuilding the (python) plan on each node, instead of serializing the operator. The plan creation is deterministic, and every operator is uniquely identified by an ID that is already known to the java function. This change will allow us to easily support custom serializers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
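The rework described above, rebuilding the plan deterministically on each node and looking operators up by ID instead of shipping dill-serialized bytes, can be sketched as follows. All names (build_plan, resolve_operator) are hypothetical, not the Python API's real internals.

```python
def build_plan():
    # Deterministic plan construction: running this on any node yields the
    # same operators with the same IDs (assigned in creation order), so only
    # the ID needs to cross the Java/Python boundary.
    operators = {}

    def register(udf):
        op_id = len(operators)
        operators[op_id] = udf
        return op_id

    register(lambda x: x + 1)        # id 0: a map udf
    register(lambda x: x % 2 == 0)   # id 1: a filter udf
    return operators

def resolve_operator(op_id):
    # Called at runtime on each worker; replaces deserializing a byte[]
    # produced by dill with a plain dictionary lookup.
    return build_plan()[op_id]
```

Because the plan function is rerun locally, chained user code and custom serializers are available directly, with no second deserialization step.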
[jira] [Commented] (FLINK-1916) EOFException when running delta-iteration job
[ https://issues.apache.org/jira/browse/FLINK-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506732#comment-14506732 ] Stefan Bunk commented on FLINK-1916: Isn't the code I linked already quite minimal? What do you need? EOFException when running delta-iteration job - Key: FLINK-1916 URL: https://issues.apache.org/jira/browse/FLINK-1916 Project: Flink Issue Type: Bug Environment: 0.9-milestone-1 Exception on the cluster, local execution works Reporter: Stefan Bunk The delta-iteration program in [1] ends with an java.io.EOFException at org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.nextSegment(InMemoryPartition.java:333) at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.advance(AbstractPagedOutputView.java:140) at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeByte(AbstractPagedOutputView.java:223) at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeLong(AbstractPagedOutputView.java:291) at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeDouble(AbstractPagedOutputView.java:307) at org.apache.flink.api.common.typeutils.base.DoubleSerializer.serialize(DoubleSerializer.java:62) at org.apache.flink.api.common.typeutils.base.DoubleSerializer.serialize(DoubleSerializer.java:26) at org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:89) at org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:29) at org.apache.flink.runtime.operators.hash.InMemoryPartition.appendRecord(InMemoryPartition.java:219) at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:536) at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:347) at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:209) at 
org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:270) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217) at java.lang.Thread.run(Thread.java:745) For logs and the accompanying mailing list discussion see below. When running with slightly different memory configuration, as hinted on the mailing list, I sometimes also get this exception: 19.Apr. 13:39:29 INFO Task - IterationHead(WorksetIteration (Resolved-Redirects)) (10/10) switched to FAILED : java.lang.IndexOutOfBoundsException: Index: 161, Size: 161 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.resetTo(InMemoryPartition.java:352) at org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.access$100(InMemoryPartition.java:301) at org.apache.flink.runtime.operators.hash.InMemoryPartition.appendRecord(InMemoryPartition.java:226) at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:536) at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:347) at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:209) at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:270) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217) at java.lang.Thread.run(Thread.java:745) [1] Flink job: https://gist.github.com/knub/0bfec859a563009c1d57 [2] Job manager logs: https://gist.github.com/knub/01e3a4b0edb8cde66ff4 [3] One task manager's logs: https://gist.github.com/knub/8f2f953da95c8d7adefc [4] 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/EOFException-when-running-Flink-job-td1092.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1922) Failed task deployment causes NPE on input split assignment
Robert Metzger created FLINK-1922: - Summary: Failed task deployment causes NPE on input split assignment Key: FLINK-1922 URL: https://issues.apache.org/jira/browse/FLINK-1922 Project: Flink Issue Type: Bug Components: JobManager Affects Versions: 0.9 Reporter: Robert Metzger The input split assignment code is returning {null} if the Task has failed, which is causing a NPE. We should improve our error handling / reporting in that situation. {code} 13:12:31,002 INFO org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 - Status of job c0b47ce41e9a85a628a628a3977705ef (Flink Java Job at Tue Apr 21 13:10:36 UTC 2015) changed to FAILING Cannot deploy task - TaskManager not responding.. 13:12:47,591 ERROR org.apache.flink.runtime.operators.RegularPactTask - Error in task code: CHAIN DataSource (at userMethod (org.apache.flink.api.java.io.AvroInputFormat)) - FlatMap (FlatMap at main(UserClass.java:111)) (20/50) java.lang.RuntimeException: Requesting the next InputSplit failed. at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:88) at org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:337) at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:136) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at java.io.ByteArrayInputStream.init(ByteArrayInputStream.java:106) at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:301) at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:83) ... 
4 more 13:12:47,595 INFO org.apache.flink.runtime.taskmanager.Task - CHAIN DataSource (at SomeMethod (org.apache.flink.api.java.io.AvroInputFormat)) - FlatMap (FlatMap at main(SomeClass.java:111)) (20/50) switched to FAILED : java.lang.RuntimeException: Requesting the next InputSplit failed. at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:88) at org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:337) at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:136) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at java.io.ByteArrayInputStream.init(ByteArrayInputStream.java:106) at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:301) at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:83) ... 4 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
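The NPE above comes from feeding a null byte array into deserialization when the split assignment returns nothing for a failed task. A minimal sketch of the improved error handling (hypothetical names, not Flink's actual TaskInputSplitProvider) guards the null case with a descriptive error instead:

```python
def deserialize(data):
    # Stand-in for InstantiationUtil.deserializeObject.
    return data.decode("utf-8")

def get_next_input_split(serialized_split):
    if serialized_split is None:
        # Previously: new ByteArrayInputStream(null) -> NullPointerException.
        # Fail with a message that points at the actual cause instead.
        raise RuntimeError(
            "JobManager returned no input split; the task may have been "
            "marked FAILED during deployment.")
    return deserialize(serialized_split)
```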
[GitHub] flink pull request: [FLINK-1891]Add the input storageDirectory emp...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/601 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1891) Add isEmpty check when the input dir
[ https://issues.apache.org/jira/browse/FLINK-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506825#comment-14506825 ] ASF GitHub Bot commented on FLINK-1891: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/601 Add isEmpty check when the input dir Key: FLINK-1891 URL: https://issues.apache.org/jira/browse/FLINK-1891 Project: Flink Issue Type: Bug Components: Local Runtime Affects Versions: master Reporter: Sibao Hong Assignee: Sibao Hong Add the input storageDirectory empty check, if input of storageDirectory is empty, we should use tmp as the base dir -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-703] Use complete element as join key
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/572 ---
[jira] [Commented] (FLINK-703) Use complete element as join key.
[ https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506826#comment-14506826 ] ASF GitHub Bot commented on FLINK-703: -- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/572 Use complete element as join key. - Key: FLINK-703 URL: https://issues.apache.org/jira/browse/FLINK-703 Project: Flink Issue Type: Improvement Reporter: GitHub Import Assignee: Chiwan Park Priority: Trivial Labels: github-import Fix For: pre-apache In some situations such as semi-joins it could make sense to use a complete element as join key. Currently this can be done using a key-selector function, but we could offer a shortcut for that. This is not an urgent issue, but might be helpful. Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/703 Created by: [fhueske|https://github.com/fhueske] Labels: enhancement, java api, user satisfaction, Milestone: Release 0.6 (unplanned) Created at: Thu Apr 17 23:40:00 CEST 2014 State: open -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-703) Use complete element as join key.
[ https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-703: Fix Version/s: (was: pre-apache) 0.9 Use complete element as join key. - Key: FLINK-703 URL: https://issues.apache.org/jira/browse/FLINK-703 Project: Flink Issue Type: Improvement Reporter: GitHub Import Assignee: Chiwan Park Priority: Trivial Labels: github-import Fix For: 0.9 In some situations such as semi-joins it could make sense to use a complete element as join key. Currently this can be done using a key-selector function, but we could offer a shortcut for that. This is not an urgent issue, but might be helpful. Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/703 Created by: [fhueske|https://github.com/fhueske] Labels: enhancement, java api, user satisfaction, Milestone: Release 0.6 (unplanned) Created at: Thu Apr 17 23:40:00 CEST 2014 State: open -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-703) Use complete element as join key.
[ https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-703. - Resolution: Implemented Implemented with 45e680c2b6c9c2f64ce55423b755a13d402ff8ba Use complete element as join key. - Key: FLINK-703 URL: https://issues.apache.org/jira/browse/FLINK-703 Project: Flink Issue Type: Improvement Reporter: GitHub Import Assignee: Chiwan Park Priority: Trivial Labels: github-import Fix For: pre-apache In some situations such as semi-joins it could make sense to use a complete element as join key. Currently this can be done using a key-selector function, but we could offer a shortcut for that. This is not an urgent issue, but might be helpful. Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/703 Created by: [fhueske|https://github.com/fhueske] Labels: enhancement, java api, user satisfaction, Milestone: Release 0.6 (unplanned) Created at: Thu Apr 17 23:40:00 CEST 2014 State: open -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1891) Add isEmpty check when the input dir
[ https://issues.apache.org/jira/browse/FLINK-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-1891. -- Resolution: Fixed Fix Version/s: 0.9 Fixed with a0147c493cf210a0914c35200ebfacd47515374d Add isEmpty check when the input dir Key: FLINK-1891 URL: https://issues.apache.org/jira/browse/FLINK-1891 Project: Flink Issue Type: Bug Components: Local Runtime Affects Versions: master Reporter: Sibao Hong Assignee: Sibao Hong Fix For: 0.9 Add the input storageDirectory empty check, if input of storageDirectory is empty, we should use tmp as the base dir -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/372 ---
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506913#comment-14506913 ] ASF GitHub Bot commented on FLINK-1486: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/372 Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Maximilian Michels Assignee: Maximilian Michels Priority: Minor Labels: usability Fix For: 0.9 The output of the {{print}} method of {{DataSet}} is mainly used for debug purposes. Currently, it is difficult to identify the output. I would suggest to add another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet data = env.fromElements(1,2,3,4,5); {code} For example, {{data.print(MyDataSet: )}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
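The prefixing behavior proposed in the issue can be sketched in a few lines (a Python stand-in, not the actual Java DataSet API; print_with_prefix is a hypothetical helper). Assumed behavior: each record is written on its own line, prefixed with the supplied identifier string.

```python
def print_with_prefix(dataset, prefix):
    # One line per record, each prefixed with the user-supplied identifier,
    # so interleaved output from several print sinks stays distinguishable.
    return "\n".join(prefix + str(record) for record in dataset)
```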
[jira] [Commented] (FLINK-1804) flink-quickstart-scala tests fail on scala-2.11 build profile on travis
[ https://issues.apache.org/jira/browse/FLINK-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506691#comment-14506691 ] Robert Metzger commented on FLINK-1804: --- I didn't see them lately. I'll close the issue for now. flink-quickstart-scala tests fail on scala-2.11 build profile on travis --- Key: FLINK-1804 URL: https://issues.apache.org/jira/browse/FLINK-1804 Project: Flink Issue Type: Task Components: Build System, Quickstarts Affects Versions: 0.9 Reporter: Robert Metzger Travis builds on master started failing after the Scala 2.11 profile has been added to Flink. For example: https://travis-ci.org/apache/flink/jobs/56312734 The error: {code} [INFO] [INFO] --- scala-maven-plugin:3.1.4:compile (default) @ testArtifact --- [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype-apache [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype-apache [INFO] [WARNING] Expected all dependencies to require Scala version: 2.10.4 [INFO] [WARNING] com.twitter:chill_2.10:0.5.2 requires scala version: 2.10.4 [INFO] [WARNING] com.twitter:chill-avro_2.10:0.5.2 requires scala version: 2.10.4 [INFO] [WARNING] com.twitter:chill-bijection_2.10:0.5.2 requires scala version: 2.10.4 [INFO] [WARNING] com.twitter:bijection-core_2.10:0.7.2 requires scala version: 2.10.4 [INFO] [WARNING] com.twitter:bijection-avro_2.10:0.7.2 requires scala version: 2.10.4 [INFO] [WARNING] org.scala-lang:scala-reflect:2.10.4 requires scala version: 2.10.4 [INFO] [WARNING] org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala version: 2.10.4 [INFO] [WARNING] org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala version: 2.10.4 [INFO] [WARNING] org.scala-lang:scala-compiler:2.10.4 requires scala version: 2.10.4 [INFO] [WARNING] 
org.scalamacros:quasiquotes_2.10:2.0.1 requires scala version: 2.10.4 [INFO] [WARNING] org.apache.flink:flink-streaming-scala:0.9-SNAPSHOT requires scala version: 2.11.4 [INFO] [WARNING] Multiple versions of scala libraries detected! [INFO] [INFO] /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala:-1: info: compiling [INFO] [INFO] Compiling 3 source files to /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes at 1427650524446 [INFO] [ERROR] error: [INFO] [INFO] while compiling: /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala/org/apache/flink/archetypetest/SocketTextStreamWordCount.scala [INFO] [INFO] during phase: typer [INFO] [INFO] library version: version 2.10.4 [INFO] [INFO] compiler version: version 2.10.4 [INFO] [INFO] reconstructed args: -d /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes -classpath
[jira] [Resolved] (FLINK-1804) flink-quickstart-scala tests fail on scala-2.11 build profile on travis
[ https://issues.apache.org/jira/browse/FLINK-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger resolved FLINK-1804. --- Resolution: Invalid flink-quickstart-scala tests fail on scala-2.11 build profile on travis --- Key: FLINK-1804 URL: https://issues.apache.org/jira/browse/FLINK-1804 Project: Flink Issue Type: Task Components: Build System, Quickstarts Affects Versions: 0.9 Reporter: Robert Metzger Travis builds on master started failing after the Scala 2.11 profile has been added to Flink. For example: https://travis-ci.org/apache/flink/jobs/56312734 The error: {code} [INFO] [INFO] --- scala-maven-plugin:3.1.4:compile (default) @ testArtifact --- [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype-apache [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype-apache [INFO] [WARNING] Expected all dependencies to require Scala version: 2.10.4 [INFO] [WARNING] com.twitter:chill_2.10:0.5.2 requires scala version: 2.10.4 [INFO] [WARNING] com.twitter:chill-avro_2.10:0.5.2 requires scala version: 2.10.4 [INFO] [WARNING] com.twitter:chill-bijection_2.10:0.5.2 requires scala version: 2.10.4 [INFO] [WARNING] com.twitter:bijection-core_2.10:0.7.2 requires scala version: 2.10.4 [INFO] [WARNING] com.twitter:bijection-avro_2.10:0.7.2 requires scala version: 2.10.4 [INFO] [WARNING] org.scala-lang:scala-reflect:2.10.4 requires scala version: 2.10.4 [INFO] [WARNING] org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala version: 2.10.4 [INFO] [WARNING] org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala version: 2.10.4 [INFO] [WARNING] org.scala-lang:scala-compiler:2.10.4 requires scala version: 2.10.4 [INFO] [WARNING] org.scalamacros:quasiquotes_2.10:2.0.1 requires scala version: 2.10.4 [INFO] [WARNING] 
org.apache.flink:flink-streaming-scala:0.9-SNAPSHOT requires scala version: 2.11.4 [INFO] [WARNING] Multiple versions of scala libraries detected! [INFO] [INFO] /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala:-1: info: compiling [INFO] [INFO] Compiling 3 source files to /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes at 1427650524446 [INFO] [ERROR] error: [INFO] [INFO] while compiling: /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala/org/apache/flink/archetypetest/SocketTextStreamWordCount.scala [INFO] [INFO] during phase: typer [INFO] [INFO] library version: version 2.10.4 [INFO] [INFO] compiler version: version 2.10.4 [INFO] [INFO] reconstructed args: -d /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes -classpath
[jira] [Created] (FLINK-1920) Passing -D akka.ask.timeout=5 min to yarn client does not work
Robert Metzger created FLINK-1920: - Summary: Passing -D akka.ask.timeout=5 min to yarn client does not work Key: FLINK-1920 URL: https://issues.apache.org/jira/browse/FLINK-1920 Project: Flink Issue Type: Bug Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger That's probably an issue of the command-line parsing. Variations like -D akka.ask.timeout=5 min or -D akka.ask.timeout=5 min are also not working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1921) Rework parallelism/slots handling for per-job YARN sessions
Robert Metzger created FLINK-1921: - Summary: Rework parallelism/slots handling for per-job YARN sessions Key: FLINK-1921 URL: https://issues.apache.org/jira/browse/FLINK-1921 Project: Flink Issue Type: Bug Components: YARN Client Reporter: Robert Metzger Right now, the -p argument is overwriting the -ys argument for per job yarn sessions. Also, the priorities for parallelism should be documented: low to high 1. flink-conf.yaml (-D arguments on YARN) 2. -p on ./bin/flink 3. ExecutionEnvironment.setParallelism() 4. Operator.setParallelism(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
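The precedence order listed in the issue (low to high: flink-conf.yaml / -D on YARN, then -p on ./bin/flink, then ExecutionEnvironment.setParallelism(), then Operator.setParallelism()) can be sketched as a small resolution helper. Names are hypothetical, and the fallback of 1 when nothing is set is an assumption for illustration, not documented behavior.

```python
def effective_parallelism(conf_default=None, cli_p=None, env_p=None, op_p=None):
    # Walk the sources from highest to lowest priority; the first one that
    # is explicitly set wins.
    for value in (op_p, env_p, cli_p, conf_default):
        if value is not None:
            return value
    return 1  # assumed fallback when nothing is configured
```

Under this scheme a per-job YARN session's -ys slots would feed into the low-priority config layer rather than being silently overwritten by -p.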
[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95153627 I think you are right. If there's only one sink active, there is no need for a sink identifier. ---
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506968#comment-14506968 ] ASF GitHub Bot commented on FLINK-1486: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95153627 I think you are right. If there's only one sink active, there is no need for a sink identifier. Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Maximilian Michels Assignee: Maximilian Michels Priority: Minor Labels: usability Fix For: 0.9 The output of the {{print}} method of {{DataSet}} is mainly used for debug purposes. Currently, it is difficult to identify the output. I would suggest to add another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet data = env.fromElements(1,2,3,4,5); {code} For example, {{data.print(MyDataSet: )}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1878) Add mode to Environments to deactivate sysout printing
[ https://issues.apache.org/jira/browse/FLINK-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506978#comment-14506978 ] Stephan Ewen commented on FLINK-1878: - Complemented in b70431239a5e18555866addb41ee6edf2b79ff60 Add mode to Environments to deactivate sysout printing -- Key: FLINK-1878 URL: https://issues.apache.org/jira/browse/FLINK-1878 Project: Flink Issue Type: Bug Components: Tests Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The test output is currently spoiled for debugging by all the sysout output from the RemoteEnvironment-based tests The execution environment should offer a mode to activate/deactivate printing to System.out - for tests, we should deactivate this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment
[ https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1918. - Resolution: Fixed Fix Version/s: 0.9 Assignee: Stephan Ewen Fixed via 2b8db40ac40d70027ce331f3a04c6ca7aa562a84 NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment - Key: FLINK-1918 URL: https://issues.apache.org/jira/browse/FLINK-1918 Project: Flink Issue Type: Bug Components: YARN Client Reporter: Zoltán Zvara Assignee: Stephan Ewen Labels: yarn, yarn-client Fix For: 0.9 Trace: {code} Exception in thread main java.lang.NullPointerException at org.apache.flink.client.program.Client.init(Client.java:104) at org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:86) at org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:82) at org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:70) at Wordcount.main(Wordcount.java:23) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140) {code} The constructor is trying to set configuration parameter {{jobmanager.rpc.address}} with {{jobManagerAddress.getAddress().getHostAddress()}}, but {{jobManagerAddress.holder.addr}} is {{null}}. {{jobManagerAddress.holder.hostname}} and {{port}} holds the valid information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment
[ https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1918.
---
[GitHub] flink pull request: Periodic full stream aggregates added + partit...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/614#issuecomment-95156797 LGTM, really nice feature. Maybe we should also discourage or even disable the current full-stream reduces, where the state might grow infinitely, now that we have this option. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment
[ https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506983#comment-14506983 ] Stephan Ewen commented on FLINK-1918: [~Ehnalis] It should be fixed in the latest master. You can compile it yourself, or wait a few hours until Travis/Apache have synced the Maven repositories with the new version.
[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/610#issuecomment-95157189 LGTM. ---
[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo
[ https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506986#comment-14506986 ] ASF GitHub Bot commented on FLINK-1909: Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/610#issuecomment-95157189 LGTM.

Refactor streaming scala api to use returns for adding typeinfo
---------------------------------------------------------------
Key: FLINK-1909
URL: https://issues.apache.org/jira/browse/FLINK-1909
Project: Flink
Issue Type: Improvement
Components: Scala API, Streaming
Reporter: Gyula Fora
Assignee: Gyula Fora

Currently the streaming Scala API uses transform to pass the extracted type information instead of .returns. This leads to a lot of code duplication.
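The duplication the issue describes can be sketched with a toy model. These are NOT Flink's real classes; they only illustrate the two styles: threading type information through every explicit transform(...) call versus attaching it afterwards with a returns(...)-style setter, so operations like map no longer have to repeat the plumbing.

```java
import java.util.function.Function;

// Hypothetical, simplified model of the refactoring, not Flink's actual API.
public class ReturnsPatternDemo {

    // Stand-in for Flink's TypeInformation.
    static class TypeInfo<T> {
        final String name;
        TypeInfo(String name) { this.name = name; }
    }

    static class Stream<T> {
        TypeInfo<T> typeInfo;

        // Old style: every wrapper must pass the extracted output type
        // through an explicit transform(...) call.
        <R> Stream<R> transform(String opName, TypeInfo<R> outType, Function<T, R> fn) {
            Stream<R> out = new Stream<>();
            out.typeInfo = outType;
            return out;
        }

        // New style: map(...) contains no type plumbing at all ...
        <R> Stream<R> map(Function<T, R> fn) {
            return new Stream<>();
        }

        // ... and the type information is attached afterwards.
        Stream<T> returns(TypeInfo<T> outType) {
            this.typeInfo = outType;
            return this;
        }
    }

    public static void main(String[] args) {
        Stream<String> source = new Stream<>();

        // Old style: the operation and its output type are interleaved.
        Stream<Integer> a = source.transform("map", new TypeInfo<>("Int"), String::length);

        // New style: operation and type info are decoupled, so each
        // wrapper no longer duplicates the transform plumbing.
        Stream<Integer> b = source.map(String::length).returns(new TypeInfo<>("Int"));

        System.out.println(a.typeInfo.name.equals(b.typeInfo.name));
    }
}
```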