[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library

2015-04-22 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507333#comment-14507333
 ] 

Till Rohrmann commented on FLINK-1745:
--

Hi Raghav,

if I understood it correctly, approaches 1 and 2 implement the same 
approximate kNN algorithm. The only difference is that the first paper 
implements it on MapReduce and the second on a relational database.

I personally think that we should eventually add an approximate kNN 
implementation to the ML library, because we want to scale to large amounts 
of data. The exact implementation can serve as a good baseline method, though.

The problem with zkNN, IMHO, is computing the z-value for double-valued 
feature vectors. There is another paper, 
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4118757, which 
implements a different approximation algorithm for kNN. This might be an 
alternative to zkNN, or at least serve as a good comparison for it.
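For illustration, here is one way the z-value computation for doubles can be approached (a minimal 2-dimensional sketch of my own, not taken from either paper): each double is first mapped to a 64-bit key whose unsigned order matches the numeric order, and the keys' top bits are then interleaved into a Morton code, so that sorting by z-value roughly preserves spatial locality. zkNN additionally uses random shifts and more dimensions, which this sketch omits.

```java
public class ZValue {
    // Map a double to a 64-bit key whose *unsigned* order matches the
    // double's numeric order: flip only the sign bit for non-negatives,
    // flip all bits for negatives (standard IEEE-754 trick).
    static long sortableBits(double d) {
        long bits = Double.doubleToLongBits(d);
        return bits ^ ((bits >> 63) | Long.MIN_VALUE);
    }

    // Interleave the top 32 bits of two keys into one 64-bit z-value,
    // alternating bits of 'a' and 'b' from the most significant position down.
    static long interleave(long a, long b) {
        long z = 0L;
        for (int i = 0; i < 32; i++) {
            z |= ((a >>> (63 - i)) & 1L) << (63 - 2 * i);
            z |= ((b >>> (63 - i)) & 1L) << (62 - 2 * i);
        }
        return z;
    }

    // z-value of a 2-d point; compare results with Long.compareUnsigned.
    static long zValue(double x, double y) {
        return interleave(sortableBits(x), sortableBits(y));
    }
}
```

Points close in space then tend to be close in the unsigned order of their z-values, which is what lets zkNN reduce neighbour search to range scans over sorted data.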

Maybe each of you [~raghav.chalapa...@gmail.com] [~chiwanpark] picks one 
algorithm to implement, and then we give the users of the ML library the 
choice to select what suits them best. What do you think?

 Add k-nearest-neighbours algorithm to machine learning library
 --

 Key: FLINK-1745
 URL: https://issues.apache.org/jira/browse/FLINK-1745
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Chiwan Park
  Labels: ML, Starter

 Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial, 
 it is still used as a means to classify data and to do regression.
 Could be a starter task.
 Resources:
 [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm]
 [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]
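The exact baseline really is tiny; a brute-force sketch (illustrative only, not a proposed API for the library) that classifies a query point by majority vote among its k nearest neighbours under squared Euclidean distance:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Minimal exact (brute-force) kNN classifier: O(n) distance computations
// per query, the baseline that scalable implementations are compared against.
public class Knn {
    static int classify(double[][] points, int[] labels, double[] query, int k) {
        Integer[] idx = new Integer[points.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        // Sort indices by squared Euclidean distance to the query.
        Arrays.sort(idx, Comparator.comparingDouble(i -> sqDist(points[i], query)));
        // Majority vote among the k nearest neighbours.
        Map<Integer, Integer> votes = new HashMap<>();
        for (int i = 0; i < k; i++) votes.merge(labels[idx[i]], 1, Integer::sum);
        return votes.entrySet().stream().max(Map.Entry.comparingByValue()).get().getKey();
    }

    static double sqDist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return s;
    }
}
```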



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (FLINK-1865) Unstable test KafkaITCase

2015-04-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi updated FLINK-1865:
--
Comment: was deleted

(was: Go ahead.)

 Unstable test KafkaITCase
 -

 Key: FLINK-1865
 URL: https://issues.apache.org/jira/browse/FLINK-1865
 Project: Flink
  Issue Type: Bug
  Components: Streaming, Tests
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Márton Balassi

 {code}
 Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
 04/10/2015 13:46:53   Job execution switched to status RUNNING.
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to SCHEDULED
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to DEPLOYING
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to SCHEDULED
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to DEPLOYING
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to RUNNING
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to RUNNING
 04/10/2015 13:47:04   Custom Source -> Stream Sink(1/1) switched to FAILED
 java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
 	at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
 	at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
 	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
 	at java.lang.Thread.run(Thread.java:701)
 Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
 	at org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54)
 	at org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39)
 	at org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196)
 	at org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42)
 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
 	... 4 more
 Caused by: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
 	at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166)
 	at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141)
 	at org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41)
 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
 	... 9 more
 04/10/2015 13:47:04   Job execution switched to status FAILING.
 04/10/2015 13:47:04   Custom Source -> Stream Sink(1/1) switched to CANCELING
 04/10/2015 13:47:04   Custom Source -> Stream Sink(1/1) switched to CANCELED
 04/10/2015 13:47:04   Job execution switched to status FAILED.
 04/10/2015 13:47:05   Job execution switched to status RUNNING.
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to SCHEDULED
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to DEPLOYING
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to SCHEDULED
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to DEPLOYING
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to RUNNING
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to RUNNING
 04/10/2015 13:47:15   Custom Source -> Stream Sink(1/1) switched to FAILED
 java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
 	at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
 	at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
 	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
 	at java.lang.Thread.run(Thread.java:701)
 Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
 	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
 	at 


[jira] [Created] (FLINK-1931) Add ADMM to optimization framework

2015-04-22 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1931:


 Summary: Add ADMM to optimization framework
 Key: FLINK-1931
 URL: https://issues.apache.org/jira/browse/FLINK-1931
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann


With the optimization framework soon in place, we can think about adding 
more optimization algorithms to it. One new addition could be the 
alternating direction method of multipliers (ADMM) [1, 2].

Resources
[1] http://stanford.edu/~boyd/admm.html
[2] http://en.wikipedia.org/wiki/Augmented_Lagrangian_method
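As a concrete toy illustration of the method (not of the planned framework API): ADMM splits an objective f(x) + g(z) subject to x = z and alternates an x-minimization, a proximal step in z, and a dual update. The sketch below solves the scalar lasso problem min_x (1/2)(x - v)^2 + lambda*|x|, whose known closed-form solution (soft-thresholding of v) makes the convergence easy to check.

```java
public class Admm {
    // Soft-thresholding: the proximal operator of kappa * |z|.
    static double shrink(double a, double kappa) {
        return Math.signum(a) * Math.max(Math.abs(a) - kappa, 0.0);
    }

    // Scaled-form ADMM for min_x (1/2)(x - v)^2 + lambda * |x|,
    // split as f(x) = (1/2)(x - v)^2, g(z) = lambda * |z|, constraint x = z.
    static double lasso1d(double v, double lambda, double rho, int iters) {
        double x = 0.0, z = 0.0, u = 0.0;
        for (int k = 0; k < iters; k++) {
            x = (v + rho * (z - u)) / (1.0 + rho); // exact x-minimization
            z = shrink(x + u, lambda / rho);       // proximal step on g
            u = u + x - z;                         // scaled dual update
        }
        return z;
    }
}
```

The fixed point is the soft-threshold of v at lambda, e.g. lasso1d(3.0, 1.0, 1.0, 200) converges to 2.0; in the distributed setting the same three-step structure is kept, with the x-update parallelized over data partitions.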



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-1931) Add ADMM to optimization framework

2015-04-22 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-1931:
-
Labels: ML  (was: )

 Add ADMM to optimization framework
 --

 Key: FLINK-1931
 URL: https://issues.apache.org/jira/browse/FLINK-1931
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
  Labels: ML

 With the optimization framework soon in place, we can think about 
 adding more optimization algorithms to it. One new addition could be the 
 alternating direction method of multipliers (ADMM) [1, 2].
 Resources
 [1] http://stanford.edu/~boyd/admm.html
 [2] http://en.wikipedia.org/wiki/Augmented_Lagrangian_method



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507004#comment-14507004
 ] 

ASF GitHub Bot commented on FLINK-1909:
---

Github user gyfora commented on a diff in the pull request:

https://github.com/apache/flink/pull/610#discussion_r28868351
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java
 ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
 	 * @return the data stream constructed
 	 */
 	@SuppressWarnings("unchecked")
-	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-			TypeInformation<OUT> outTypeInfo, String sourceName) {
+	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+		TypeInformation<OUT> outTypeInfo;
 
-		if (outTypeInfo == null) {
-			if (function instanceof GenericSourceFunction) {
-				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-			} else {
+		if (function instanceof GenericSourceFunction) {
+			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+		} else {
+			try {
 				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class,
 						function.getClass(), 0, null, null);
+			} catch (InvalidTypesException e) {
+				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

This is the same mechanism that all Flink operators use to allow missing 
types, which can then be filled in with .returns(..) afterwards.
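The mechanism can be sketched in isolation (class and method names below are illustrative stand-ins, not Flink's real API): the extraction failure is stored inside a placeholder type info instead of being thrown, and it only surfaces if the user never overrides the type with a returns(..)-style call.

```java
// Sketch of the "missing type" fallback pattern (illustrative names only).
public class TypeFallback {
    static class TypeInfo {
        final String name; final Exception failure;
        TypeInfo(String name) { this(name, null); }
        TypeInfo(String name, Exception failure) { this.name = name; this.failure = failure; }
        boolean isMissing() { return failure != null; }
    }

    static class Stream {
        private TypeInfo typeInfo;
        Stream(TypeInfo t) { this.typeInfo = t; }
        // User-supplied type replaces the placeholder, like .returns(..).
        Stream returns(TypeInfo explicit) { this.typeInfo = explicit; return this; }
        TypeInfo getType() {
            if (typeInfo.isMissing()) {
                // Only now does the stored extraction failure surface.
                throw new IllegalStateException("type could not be extracted", typeInfo.failure);
            }
            return typeInfo;
        }
    }

    static Stream addSource(boolean extractable) {
        try {
            if (!extractable) throw new Exception("could not extract type");
            return new Stream(new TypeInfo("ExtractedType"));
        } catch (Exception e) {
            // Record the failure in a placeholder instead of aborting.
            return new Stream(new TypeInfo("Custom source", e));
        }
    }
}
```

The design choice being discussed in the diff is exactly this: fail lazily so the eager TypeExtractor call can be moved inside addSource without breaking sources whose type can only be supplied later.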


 Refactor streaming scala api to use returns for adding typeinfo
 ---

 Key: FLINK-1909
 URL: https://issues.apache.org/jira/browse/FLINK-1909
 Project: Flink
  Issue Type: Improvement
  Components: Scala API, Streaming
Reporter: Gyula Fora
Assignee: Gyula Fora

 Currently the streaming Scala API uses transform to pass the extracted type 
 information instead of .returns. This leads to a lot of code duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase

2015-04-22 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507058#comment-14507058
 ] 

Robert Metzger commented on FLINK-1865:
---

The issue still persists (this is a build from today's master): 
https://s3.amazonaws.com/archive.travis-ci.org/jobs/59538212/log.txt

 Unstable test KafkaITCase
 -

 Key: FLINK-1865
 URL: https://issues.apache.org/jira/browse/FLINK-1865
 Project: Flink
  Issue Type: Bug
  Components: Streaming, Tests
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Márton Balassi


[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase

2015-04-22 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507069#comment-14507069
 ] 

Robert Metzger commented on FLINK-1865:
---

I have written to the Kafka mailing list to ask about our 
PersistentKafkaSource approach, in order to resolve all the Kafka issues: 
http://mail-archives.apache.org/mod_mbox/kafka-users/201504.mbox/%3CCAGr9p8Dvx1OKo0q4iTtR7ped7rxgcaHKpbndo3imuJzLvuG03Q%40mail.gmail.com%3E
Once we find a way to use the high-level consumer with manual offset control, 
this issue will certainly be resolved.

[~bamrabi]: is it okay for you if I assign the issue to myself?

 Unstable test KafkaITCase
 -

 Key: FLINK-1865
 URL: https://issues.apache.org/jira/browse/FLINK-1865
 Project: Flink
  Issue Type: Bug
  Components: Streaming, Tests
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Márton Balassi


[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507012#comment-14507012
 ] 

ASF GitHub Bot commented on FLINK-1909:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/610#discussion_r28868831
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java
 ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
 	 * @return the data stream constructed
 	 */
 	@SuppressWarnings("unchecked")
-	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-			TypeInformation<OUT> outTypeInfo, String sourceName) {
+	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+		TypeInformation<OUT> outTypeInfo;
 
-		if (outTypeInfo == null) {
-			if (function instanceof GenericSourceFunction) {
-				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-			} else {
+		if (function instanceof GenericSourceFunction) {
+			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+		} else {
+			try {
 				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class,
 						function.getClass(), 0, null, null);
+			} catch (InvalidTypesException e) {
+				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

Okay, I see


 Refactor streaming scala api to use returns for adding typeinfo
 ---

 Key: FLINK-1909
 URL: https://issues.apache.org/jira/browse/FLINK-1909
 Project: Flink
  Issue Type: Improvement
  Components: Scala API, Streaming
Reporter: Gyula Fora
Assignee: Gyula Fora

 Currently the streaming Scala API uses transform to pass the extracted type 
 information instead of .returns. This leads to a lot of code duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase

2015-04-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507361#comment-14507361
 ] 

Márton Balassi commented on FLINK-1865:
---

Go ahead.

 Unstable test KafkaITCase
 -

 Key: FLINK-1865
 URL: https://issues.apache.org/jira/browse/FLINK-1865
 Project: Flink
  Issue Type: Bug
  Components: Streaming, Tests
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Márton Balassi


[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase

2015-04-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507359#comment-14507359
 ] 

Márton Balassi commented on FLINK-1865:
---

Go ahead.

 Unstable test KafkaITCase
 -

 Key: FLINK-1865
 URL: https://issues.apache.org/jira/browse/FLINK-1865
 Project: Flink
  Issue Type: Bug
  Components: Streaming, Tests
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Márton Balassi

 {code}
 Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
 04/10/2015 13:46:53   Job execution switched to status RUNNING.
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:46:53   Custom Source -> Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:47:04   Custom Source -> Stream Sink(1/1) switched to FAILED 
 java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
   at java.lang.Thread.run(Thread.java:701)
 Caused by: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54)
   at 
 org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39)
   at 
 org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196)
   at 
 org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42)
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
   ... 4 more
 Caused by: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166)
   at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141)
   at 
 org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41)
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
   ... 9 more
 04/10/2015 13:47:04   Job execution switched to status FAILING.
 04/10/2015 13:47:04   Custom Source -> Stream Sink(1/1) switched to CANCELING 
 04/10/2015 13:47:04   Custom Source -> Stream Sink(1/1) switched to CANCELED 
 04/10/2015 13:47:04   Job execution switched to status FAILED.
 04/10/2015 13:47:05   Job execution switched to status RUNNING.
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:47:05   Custom Source -> Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:47:15   Custom Source -> Stream Sink(1/1) switched to FAILED 
 java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
   at java.lang.Thread.run(Thread.java:701)
 Caused by: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
  


[jira] [Created] (FLINK-1932) Add L-BFGS to the optimization framework

2015-04-22 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1932:


 Summary: Add L-BFGS to the optimization framework
 Key: FLINK-1932
 URL: https://issues.apache.org/jira/browse/FLINK-1932
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann


A good candidate to add to the new optimization framework could be L-BFGS [1, 
2].

Resources:
[1] http://papers.nips.cc/paper/5333-large-scale-l-bfgs-using-mapreduce.pdf
[2] http://en.wikipedia.org/wiki/Limited-memory_BFGS
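
Since the ticket only links the references, a dependency-free sketch of the core of L-BFGS (the two-loop recursion described in [1, 2]) may help frame the discussion. This is an illustrative serial version on a toy quadratic objective; the class, method names, and the fixed unit step are assumptions of this sketch, not part of any Flink API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative serial L-BFGS: two-loop recursion with a bounded history. */
public class LbfgsSketch {

    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    // Gradient of the toy objective f(x) = sum_i (x_i - target_i)^2.
    static double[] grad(double[] x, double[] target) {
        double[] g = new double[x.length];
        for (int i = 0; i < x.length; i++) g[i] = 2.0 * (x[i] - target[i]);
        return g;
    }

    // Two-loop recursion: returns r ~ H^{-1} g using the last m pairs (s_k, y_k).
    static double[] twoLoop(double[] g, List<double[]> s, List<double[]> y) {
        int m = s.size();
        double[] q = g.clone();
        double[] alpha = new double[m];
        for (int i = m - 1; i >= 0; i--) {              // newest to oldest
            double rho = 1.0 / dot(y.get(i), s.get(i));
            alpha[i] = rho * dot(s.get(i), q);
            for (int j = 0; j < q.length; j++) q[j] -= alpha[i] * y.get(i)[j];
        }
        double gamma = m > 0                            // initial Hessian scaling
                ? dot(s.get(m - 1), y.get(m - 1)) / dot(y.get(m - 1), y.get(m - 1))
                : 1.0;
        double[] r = new double[q.length];
        for (int j = 0; j < r.length; j++) r[j] = gamma * q[j];
        for (int i = 0; i < m; i++) {                   // oldest to newest
            double rho = 1.0 / dot(y.get(i), s.get(i));
            double beta = rho * dot(y.get(i), r);
            for (int j = 0; j < r.length; j++) r[j] += (alpha[i] - beta) * s.get(i)[j];
        }
        return r;
    }

    static double[] minimize(double[] x0, double[] target, int iters, int memory) {
        double[] x = x0.clone();
        List<double[]> s = new ArrayList<>();
        List<double[]> y = new ArrayList<>();
        double[] g = grad(x, target);
        for (int k = 0; k < iters; k++) {
            double[] d = twoLoop(g, s, y);              // step direction is -d
            double[] xNew = x.clone();
            for (int j = 0; j < x.length; j++) xNew[j] -= d[j]; // unit step; a real solver line-searches
            double[] gNew = grad(xNew, target);
            double[] sk = new double[x.length];
            double[] yk = new double[x.length];
            for (int j = 0; j < x.length; j++) {
                sk[j] = xNew[j] - x[j];
                yk[j] = gNew[j] - g[j];
            }
            if (dot(sk, yk) > 1e-12) {                  // keep only curvature-positive pairs
                s.add(sk);
                y.add(yk);
                if (s.size() > memory) { s.remove(0); y.remove(0); }
            }
            x = xNew;
            g = gNew;
        }
        return x;
    }
}
```

On a quadratic, one stored curvature pair already makes the two-loop recursion recover the exact Newton step, so this sketch converges in two iterations; the interesting design question for the optimization framework is how to compute the gradient and the (s, y) history distributedly, which is what [1] addresses.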



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1422) Missing usage example for withParameters

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507190#comment-14507190
 ] 

ASF GitHub Bot commented on FLINK-1422:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/350


 Missing usage example for withParameters
 --

 Key: FLINK-1422
 URL: https://issues.apache.org/jira/browse/FLINK-1422
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 0.8.0
Reporter: Alexander Alexandrov
Priority: Trivial
 Fix For: 0.8.2

   Original Estimate: 1h
  Remaining Estimate: 1h

 I am struggling to find a usage example of the withParameters method in the 
 documentation. At the moment I only see this note:
 {quote}
 Note: As the content of broadcast variables is kept in-memory on each node, 
 it should not become too large. For simpler things like scalar values you can 
 simply make parameters part of the closure of a function, or use the 
 withParameters(...) method to pass in a configuration.
 {quote}
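
The mechanism the note alludes to can be sketched without a Flink dependency: a Configuration object travels with the operator and is handed to the rich function's open() method before any elements are processed. `Configuration`, `RichFilterFunction`, and `runFilter` below are simplified stand-ins for the real classes; `runFilter(..., config)` plays the role of `.filter(fn).withParameters(config)`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Dependency-free sketch of the withParameters(...) mechanism. */
public class WithParametersSketch {

    // Simplified stand-in for org.apache.flink.configuration.Configuration.
    static class Configuration {
        private final Map<String, Integer> ints = new HashMap<>();
        void setInteger(String key, int value) { ints.put(key, value); }
        int getInteger(String key, int defaultValue) {
            return ints.getOrDefault(key, defaultValue);
        }
    }

    // Simplified stand-in for a rich function: open() receives the parameters.
    abstract static class RichFilterFunction {
        abstract void open(Configuration parameters);
        abstract boolean filter(int value);
    }

    // Stand-in for dataSet.filter(fn).withParameters(config): the runtime calls
    // open(config) once before feeding elements to filter().
    static int[] runFilter(int[] data, RichFilterFunction fn, Configuration config) {
        fn.open(config);
        return Arrays.stream(data).filter(fn::filter).toArray();
    }
}
```

A hypothetical use sets an integer under a key such as "limit" on the configuration, reads it back in open(), and filters against it, which is exactly the shape of example the documentation was missing.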





[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration

2015-04-22 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507189#comment-14507189
 ] 

Vasia Kalavri commented on FLINK-1930:
--

Hi [~rmetzger]! No, I'm not sure what the root cause is. 
I bumped into this when running ~1-month-old code (which used to run fine), so 
it must be something that was introduced recently.

 NullPointerException in vertex-centric iteration
 

 Key: FLINK-1930
 URL: https://issues.apache.org/jira/browse/FLINK-1930
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.9
Reporter: Vasia Kalavri

 Hello to my Squirrels,
 I came across this exception when having a vertex-centric iteration output 
 followed by a group by. 
 I'm not sure what is causing it, since I saw this error in a rather large 
 pipeline, but I managed to reproduce it with [this code example | 
 https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309]
  and a sufficiently large dataset, e.g. [this one | 
 http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally).
 It seems to be caused by a null Buffer in the RecordWriter.
 The exception message is the following:
 Exception in thread "main" 
 org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
 at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
 at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
 at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
 at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
 at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
 at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
 at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
 at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
 at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
 at 
 org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
 at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
 at akka.actor.ActorCell.invoke(ActorCell.scala:487)
 at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
 at akka.dispatch.Mailbox.run(Mailbox.scala:221)
 at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
 at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
 at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
 at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
 at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Caused by: java.lang.NullPointerException
 at 
 org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
 at 
 org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
 at 
 org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
 at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
 at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
 at 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
 at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
 at java.lang.Thread.run(Thread.java:745)





[GitHub] flink pull request: [FLINK-1422] Add withParameters() to documenta...

2015-04-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/350


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Periodic full stream aggregates added + partit...

2015-04-22 Thread gyfora
Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/614#issuecomment-95157387
  
The current stream reduces only keep one element as their state; the problem 
is that if you run it in parallel, each node will keep a partial reduce.
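
The behavior described can be illustrated without Flink: a non-keyed running reduce over a parallel stream keeps one aggregate per parallel instance, so no single instance ever holds the global result. The class and method names below are illustrative only.

```java
/** Sketch of the issue above: a parallel (non-keyed) running reduce keeps one
 *  aggregate per parallel instance, so each instance sees only a partial result. */
public class PartialReduceSketch {

    /** Simulates a running-sum reduce with the given parallelism, distributing
     *  elements round-robin. Returns the final state held by each instance. */
    static long[] parallelRunningSum(int[] elements, int parallelism) {
        long[] state = new long[parallelism];      // one reduce state per instance
        for (int i = 0; i < elements.length; i++) {
            state[i % parallelism] += elements[i]; // each instance folds only its share
        }
        return state;
    }
}
```

With elements 1..6 and parallelism 3, the three instances end up holding 5, 7, and 9; the global sum 21 exists nowhere, which is why a parallel reduce only emits partial aggregates unless the stream is keyed or the parallelism is one.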




[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507002#comment-14507002
 ] 

ASF GitHub Bot commented on FLINK-1909:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/610#discussion_r28868234
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java
 ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
 	 * @return the data stream constructed
 	 */
 	@SuppressWarnings("unchecked")
-	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-			TypeInformation<OUT> outTypeInfo, String sourceName) {
+	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+		TypeInformation<OUT> outTypeInfo;
 
-		if (outTypeInfo == null) {
-			if (function instanceof GenericSourceFunction) {
-				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-			} else {
+		if (function instanceof GenericSourceFunction) {
+			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+		} else {
+			try {
 				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class,
 						function.getClass(), 0, null, null);
+			} catch (InvalidTypesException e) {
+				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

Why did you change this?
I suspect the exception will now be thrown somewhere else?
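
To make the concern concrete: instead of failing eagerly when type extraction throws, the fallback stores a placeholder that re-raises the original failure only when the type information is actually used. The classes below are simplified stand-ins sketching that pattern, not Flink's real types.

```java
/** Sketch of the fallback discussed above: on extraction failure, keep a
 *  MissingTypeInfo-style placeholder that defers the error until the type
 *  information is actually accessed. */
public class MissingTypeInfoSketch {

    static class InvalidTypesException extends RuntimeException {
        InvalidTypesException(String message) { super(message); }
    }

    interface TypeInformation {
        String typeName();
    }

    // Placeholder: carries the original extraction failure and rethrows lazily.
    static class MissingTypeInfo implements TypeInformation {
        private final String operatorName;
        private final InvalidTypesException cause;

        MissingTypeInfo(String operatorName, InvalidTypesException cause) {
            this.operatorName = operatorName;
            this.cause = cause;
        }

        @Override
        public String typeName() {
            throw new IllegalStateException(
                    "Type of operator '" + operatorName + "' could not be determined", cause);
        }
    }

    /** Hypothetical stand-in for TypeExtractor.createTypeInfo(...). */
    static TypeInformation extractOrDefer(boolean extractable) {
        try {
            if (!extractable) {
                throw new InvalidTypesException("generic type was erased");
            }
            return () -> "java.lang.String";               // extraction succeeded
        } catch (InvalidTypesException e) {
            return new MissingTypeInfo("Custom source", e); // defer, don't fail here
        }
    }
}
```

The consequence is exactly what the review comment suspects: the job no longer fails at graph-construction time but at the point where the missing type is first needed, unless the user supplies it explicitly beforehand.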


 Refactor streaming scala api to use returns for adding typeinfo
 ---

 Key: FLINK-1909
 URL: https://issues.apache.org/jira/browse/FLINK-1909
 Project: Flink
  Issue Type: Improvement
  Components: Scala API, Streaming
Reporter: Gyula Fora
Assignee: Gyula Fora

 Currently the streaming scala api uses transform to pass the extracted type 
 information instead of .returns. This leads to a lot of code duplication.





[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...

2015-04-22 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/610#discussion_r28868831
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java
 ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
 	 * @return the data stream constructed
 	 */
 	@SuppressWarnings("unchecked")
-	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-			TypeInformation<OUT> outTypeInfo, String sourceName) {
+	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+		TypeInformation<OUT> outTypeInfo;
 
-		if (outTypeInfo == null) {
-			if (function instanceof GenericSourceFunction) {
-				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-			} else {
+		if (function instanceof GenericSourceFunction) {
+			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+		} else {
+			try {
 				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class,
 						function.getClass(), 0, null, null);
+			} catch (InvalidTypesException e) {
+				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

Okay, I see




[GitHub] flink pull request: [runtime] Bump Netty version to 4.0.27.Final and...

2015-04-22 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/617#issuecomment-95173776
  
No, I would not add a dependency there. Maybe add a comment noting that we've 
managed a transitive dependency of HBase.




[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...

2015-04-22 Thread gyfora
Github user gyfora closed the pull request at:

https://github.com/apache/flink/pull/610




[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507131#comment-14507131
 ] 

ASF GitHub Bot commented on FLINK-1909:
---

Github user gyfora closed the pull request at:

https://github.com/apache/flink/pull/610


 Refactor streaming scala api to use returns for adding typeinfo
 ---

 Key: FLINK-1909
 URL: https://issues.apache.org/jira/browse/FLINK-1909
 Project: Flink
  Issue Type: Improvement
  Components: Scala API, Streaming
Reporter: Gyula Fora
Assignee: Gyula Fora

 Currently the streaming scala api uses transform to pass the extracted type 
 information instead of .returns. This leads to a lot of code duplication.





[GitHub] flink pull request: [FLINK-1422] Add withParameters() to documenta...

2015-04-22 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/350#issuecomment-95206967
  
Thanks, Chesnay! I've added the constructor parametrization, fixed the code 
examples, and rebased on the latest master.

Going to merge it.




[GitHub] flink pull request: [runtime] Bump Netty version to 4.0.27.Final and...

2015-04-22 Thread uce
GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/617

[runtime] Bump Netty version to 4.0.27.Final and add javassist

@rmetzger, javassist is set in the root POM. Is it OK to leave it in 
flink-runtime as I have it now w/o version?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/incubator-flink javassist

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/617.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #617


commit 6a91d722005d7f8d138e8553f92a134c52209de8
Author: Ufuk Celebi u...@apache.org
Date:   2015-04-22T12:50:38Z

[runtime] Bump Netty version to 4.0.27.Final and add javassist






[GitHub] flink pull request: [FLINK-1875] Add figure explaining slots and p...

2015-04-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/604




[jira] [Commented] (FLINK-1875) Add figure to documentation describing slots and parallelism

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507080#comment-14507080
 ] 

ASF GitHub Bot commented on FLINK-1875:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/604


 Add figure to documentation describing slots and parallelism
 

 Key: FLINK-1875
 URL: https://issues.apache.org/jira/browse/FLINK-1875
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger

 Our users are still confused about how parallelism and slots are connected to each 
 other.
 We tried addressing this issue already with FLINK-1679, but I think we also 
 need to have a nice picture in our documentation.
 This is too complicated: 
 http://ci.apache.org/projects/flink/flink-docs-master/internal_job_scheduling.html





[GitHub] flink pull request: [docs] Change doc layout

2015-04-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/606




[GitHub] flink pull request: [FLINK-1687] [streaming] Syncing streaming sou...

2015-04-22 Thread gyfora
Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/521#issuecomment-95199623
  
Hey Peti, 
I added a refactor commit that reworked the way types are handled in the 
sources a little bit. 

https://github.com/apache/flink/commit/6df1dd2cc848d0a691a98309a3bb760f9a998673
Can you please rebase on that one?

It should make things nicer.

Gyula




[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507148#comment-14507148
 ] 

ASF GitHub Bot commented on FLINK-1687:
---

Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/521#issuecomment-95199623
  
Hey Peti, 
I added a refactor commit that reworked the way types are handled in the 
sources a little bit. 

https://github.com/apache/flink/commit/6df1dd2cc848d0a691a98309a3bb760f9a998673
Can you please rebase on that one?

It should make things nicer.

Gyula


 Streaming file source/sink API is not in sync with the batch API
 

 Key: FLINK-1687
 URL: https://issues.apache.org/jira/browse/FLINK-1687
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gábor Hermann
Assignee: Péter Szabó

 Streaming environment is missing file inputs like readFile, readCsvFile and 
 also the more general createInput function, and outputs like writeAsCsv and 
 write. Streaming and batch API should be consistent.





[GitHub] flink pull request: Fixed Configurable HadoopOutputFormat (FLINK-1...

2015-04-22 Thread fpompermaier
Github user fpompermaier commented on a diff in the pull request:

https://github.com/apache/flink/pull/571#discussion_r28869610
  
--- Diff: flink-staging/flink-hbase/pom.xml ---
@@ -112,6 +112,12 @@ under the License.
				</exclusion>
			</exclusions>
		</dependency>
+		<dependency>
--- End diff --

Unfortunately the TableInputFormat and TableOutputFormat are in the server 
jar.
For the read we've reimplemented it to make it more robust so we don't need 
that jar, but for the output it is indeed required.




[jira] [Commented] (FLINK-1828) Impossible to output data to an HBase table

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507025#comment-14507025
 ] 

ASF GitHub Bot commented on FLINK-1828:
---

Github user fpompermaier commented on a diff in the pull request:

https://github.com/apache/flink/pull/571#discussion_r28869610
  
--- Diff: flink-staging/flink-hbase/pom.xml ---
@@ -112,6 +112,12 @@ under the License.
				</exclusion>
			</exclusions>
		</dependency>
+		<dependency>
--- End diff --

Unfortunately the TableInputFormat and TableOutputFormat are in the server 
jar.
For the read we've reimplemented it to make it more robust so we don't need 
that jar, but for the output it is indeed required.


 Impossible to output data to an HBase table
 ---

 Key: FLINK-1828
 URL: https://issues.apache.org/jira/browse/FLINK-1828
 Project: Flink
  Issue Type: Bug
  Components: Hadoop Compatibility
Affects Versions: 0.9
Reporter: Flavio Pompermaier
  Labels: hadoop, hbase
 Fix For: 0.9


 Right now it is not possible to use HBase TableOutputFormat as output format 
 because Configurable.setConf  is not called in the configure() method of the 
 HadoopOutputFormatBase
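
The fix described amounts to a small instanceof check in the wrapper's configure() method: formats that implement Hadoop's Configurable (as HBase's TableOutputFormat does) must have the Configuration forwarded to them. The types below are simplified stand-ins sketching that pattern, not the real Hadoop or Flink classes.

```java
import java.util.HashMap;

/** Sketch of the fix: a Hadoop-compatibility wrapper should propagate its
 *  Configuration to wrapped formats that implement Configurable. */
public class ConfigurableWrapperSketch {

    // Simplified stand-in for org.apache.hadoop.conf.Configuration.
    static class Configuration extends HashMap<String, String> {}

    // Simplified stand-in for org.apache.hadoop.conf.Configurable.
    interface Configurable {
        void setConf(Configuration conf);
        Configuration getConf();
    }

    // Stand-in for an output format that needs its conf injected,
    // like HBase's TableOutputFormat.
    static class TableOutputFormatLike implements Configurable {
        private Configuration conf;
        public void setConf(Configuration conf) { this.conf = conf; }
        public Configuration getConf() { return conf; }
    }

    /** The wrapper's configure(): the missing step is the instanceof check. */
    static void configure(Object wrappedFormat, Configuration conf) {
        if (wrappedFormat instanceof Configurable) {
            ((Configurable) wrappedFormat).setConf(conf); // forward the configuration
        }
    }
}
```

Without the forwarding step, a format like TableOutputFormatLike is left with a null configuration, which matches the symptom reported in this issue.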





[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...

2015-04-22 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/610#discussion_r28868234
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java
 ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
 	 * @return the data stream constructed
 	 */
 	@SuppressWarnings("unchecked")
-	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-			TypeInformation<OUT> outTypeInfo, String sourceName) {
+	private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+		TypeInformation<OUT> outTypeInfo;
 
-		if (outTypeInfo == null) {
-			if (function instanceof GenericSourceFunction) {
-				outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-			} else {
+		if (function instanceof GenericSourceFunction) {
+			outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+		} else {
+			try {
 				outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class,
 						function.getClass(), 0, null, null);
+			} catch (InvalidTypesException e) {
+				outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

Why did you change this?
I suspect the exception will now be thrown somewhere else?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...

2015-04-22 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/617#issuecomment-95163295
  
Ah .. we didn't change the javassist version ;)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...

2015-04-22 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/617#issuecomment-95169398
  
Yep. I've just seen that there is no reference to javassist in the 
flink-hbase pom. Is that intentional? Shouldn't we also add a dependency there 
w/o the version?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...

2015-04-22 Thread gyfora
Github user gyfora commented on a diff in the pull request:

https://github.com/apache/flink/pull/610#discussion_r28868351
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java
 ---
@@ -632,15 +608,18 @@ public void registerType(Class<?> type) {
 * @return the data stream constructed
 */
@SuppressWarnings("unchecked")
-   private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function,
-   TypeInformation<OUT> outTypeInfo, String sourceName) {
+   private <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName) {
+
+   TypeInformation<OUT> outTypeInfo;
 
-   if (outTypeInfo == null) {
-   if (function instanceof GenericSourceFunction) {
-   outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
-   } else {
+   if (function instanceof GenericSourceFunction) {
+   outTypeInfo = ((GenericSourceFunction<OUT>) function).getType();
+   } else {
+   try {
outTypeInfo = TypeExtractor.createTypeInfo(SourceFunction.class,
function.getClass(), 0, null, null);
+   } catch (InvalidTypesException e) {
+   outTypeInfo = (TypeInformation<OUT>) new MissingTypeInfo("Custom source", e);
--- End diff --

This is the same mechanism that all Flink operators use to allow missing 
types that can be filled in with .returns(..) afterwards.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-1929) Add code to cleanly stop a running streaming topology

2015-04-22 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1929:
-

 Summary: Add code to cleanly stop a running streaming topology
 Key: FLINK-1929
 URL: https://issues.apache.org/jira/browse/FLINK-1929
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Affects Versions: 0.9
Reporter: Robert Metzger


Right now it's not possible to cleanly stop a running streaming topology.

Cancelling the job will cancel all operators, but for proper exactly-once 
processing from Kafka sources, we need to provide a way to stop the sources 
first, wait until all remaining tuples have been processed, and then shut down 
the sources (so that they can commit the right offset to ZooKeeper).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[GitHub] flink pull request: [docs] Change doc layout

2015-04-22 Thread ktzoumas
Github user ktzoumas commented on the pull request:

https://github.com/apache/flink/pull/606#issuecomment-95174641
  
Good to merge


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo

2015-04-22 Thread Gyula Fora (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora resolved FLINK-1909.
---
Resolution: Fixed

https://github.com/apache/flink/commit/6df1dd2cc848d0a691a98309a3bb760f9a998673

 Refactor streaming scala api to use returns for adding typeinfo
 ---

 Key: FLINK-1909
 URL: https://issues.apache.org/jira/browse/FLINK-1909
 Project: Flink
  Issue Type: Improvement
  Components: Scala API, Streaming
Reporter: Gyula Fora
Assignee: Gyula Fora

 Currently the streaming scala api uses transform to pass the extracted type 
 information instead of .returns. This leads to a lot of code duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1422) Missing usage example for withParameters

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507163#comment-14507163
 ] 

ASF GitHub Bot commented on FLINK-1422:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/350#issuecomment-95206967
  
Thanks, Chesnay! I've added the constructor parametrization, fixed the code 
examples, and rebased on the latest master.

Going to merge it.


 Missing usage example for withParameters
 --

 Key: FLINK-1422
 URL: https://issues.apache.org/jira/browse/FLINK-1422
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 0.8.0
Reporter: Alexander Alexandrov
Priority: Trivial
 Fix For: 0.8.2

   Original Estimate: 1h
  Remaining Estimate: 1h

 I am struggling to find a usage example of the withParameters method in the 
 documentation. At the moment I only see this note:
 {quote}
 Note: As the content of broadcast variables is kept in-memory on each node, 
 it should not become too large. For simpler things like scalar values you can 
 simply make parameters part of the closure of a function, or use the 
 withParameters(...) method to pass in a configuration.
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1930) NullPointerException in vertex-centric iteration

2015-04-22 Thread Vasia Kalavri (JIRA)
Vasia Kalavri created FLINK-1930:


 Summary: NullPointerException in vertex-centric iteration
 Key: FLINK-1930
 URL: https://issues.apache.org/jira/browse/FLINK-1930
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.9
Reporter: Vasia Kalavri


Hello to my Squirrels,

I came across this exception when having a vertex-centric iteration output 
followed by a group by. 
I'm not sure what is causing it, since I saw this error in a rather large 
pipeline, but I managed to reproduce it with [this code example | 
https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309] 
and a sufficiently large dataset, e.g. [this one | 
http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally).
It seems like a null Buffer in RecordWriter.

The exception message is the following:

Exception in thread main 
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at 
org.apache.flink.runtime.jobmanager.JobManager$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at 
org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:37)
at 
org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:30)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at 
org.apache.flink.runtime.ActorLogMessages$anon$1.applyOrElse(ActorLogMessages.scala:30)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at 
org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
at 
org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
at 
org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
at 
org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
at 
org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
at 
org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
at 
org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration

2015-04-22 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507179#comment-14507179
 ] 

Robert Metzger commented on FLINK-1930:
---

I've seen this issue also recently in a streaming job.

Are you sure that this issue is the root cause? Or has the job errored before?

 NullPointerException in vertex-centric iteration
 

 Key: FLINK-1930
 URL: https://issues.apache.org/jira/browse/FLINK-1930
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.9
Reporter: Vasia Kalavri

 Hello to my Squirrels,
 I came across this exception when having a vertex-centric iteration output 
 followed by a group by. 
 I'm not sure what is causing it, since I saw this error in a rather large 
 pipeline, but I managed to reproduce it with [this code example | 
 https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309]
  and a sufficiently large dataset, e.g. [this one | 
 http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally).
 It seems like a null Buffer in RecordWriter.
 The exception message is the following:
 Exception in thread main 
 org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
 at 
 org.apache.flink.runtime.jobmanager.JobManager$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
 at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
 at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
 at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
 at 
 org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:37)
 at 
 org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:30)
 at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
 at 
 org.apache.flink.runtime.ActorLogMessages$anon$1.applyOrElse(ActorLogMessages.scala:30)
 at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
 at 
 org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
 at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
 at akka.actor.ActorCell.invoke(ActorCell.scala:487)
 at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
 at akka.dispatch.Mailbox.run(Mailbox.scala:221)
 at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
 at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
 at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
 at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
 at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Caused by: java.lang.NullPointerException
 at 
 org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
 at 
 org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
 at 
 org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
 at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
 at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
 at 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
 at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
 at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-04-22 Thread Nikolaas Steenbergen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507306#comment-14507306
 ] 

Nikolaas Steenbergen commented on FLINK-1907:
-

I've added a class to start up the shell from within Flink 
(org.apache.flink.api.scala.FlinkShell), so it can be started from the Flink 
jar directly.

Simply writing out the compiled lines of the wordcount example, putting 
them in a jar (jar cf output.jar inputDir) and uploading the jar to a local 
cluster (sh start-local.sh) via the webclient leads to:

org.apache.flink.client.program.ProgramInvocationException: Neither a 
'Main-Class', nor a 'program-class' entry was found in the jar file.
[..]

Do you have a suggestion for the easiest way to package the individually 
compiled shell commands in an executable format?

 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...

2015-04-22 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/617#issuecomment-95218150
  
OK, then this is good to merge?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Improve error messages on Task deployment

2015-04-22 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/615#issuecomment-95223394
  
+1

LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration

2015-04-22 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507290#comment-14507290
 ] 

Stephan Ewen commented on FLINK-1930:
-

I have a bit of code pending that may help to figure out whether this is a 
side-effect of cancelling, or a bug in the buffer pools.

I hope I can commit that soon...

 NullPointerException in vertex-centric iteration
 

 Key: FLINK-1930
 URL: https://issues.apache.org/jira/browse/FLINK-1930
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.9
Reporter: Vasia Kalavri

 Hello to my Squirrels,
 I came across this exception when having a vertex-centric iteration output 
 followed by a group by. 
 I'm not sure what is causing it, since I saw this error in a rather large 
 pipeline, but I managed to reproduce it with [this code example | 
 https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309]
  and a sufficiently large dataset, e.g. [this one | 
 http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally).
 It seems like a null Buffer in RecordWriter.
 The exception message is the following:
 Exception in thread main 
 org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
 at 
 org.apache.flink.runtime.jobmanager.JobManager$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
 at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
 at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
 at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
 at 
 org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:37)
 at 
 org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:30)
 at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
 at 
 org.apache.flink.runtime.ActorLogMessages$anon$1.applyOrElse(ActorLogMessages.scala:30)
 at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
 at 
 org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
 at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
 at akka.actor.ActorCell.invoke(ActorCell.scala:487)
 at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
 at akka.dispatch.Mailbox.run(Mailbox.scala:221)
 at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
 at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
 at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
 at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
 at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Caused by: java.lang.NullPointerException
 at 
 org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
 at 
 org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
 at 
 org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
 at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
 at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
 at 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
 at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
 at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library

2015-04-22 Thread Chiwan Park (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507556#comment-14507556
 ] 

Chiwan Park commented on FLINK-1745:


Hi.

About the model, I think that grouping the neighbours with respect to each of 
the queried points is better.
The following code could serve as an example:
{code}
val result: DataSet[Tuple2[Vector, Array[Vector]]] = model.transform(testingDS)
{code}
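To illustrate that output shape, here is a minimal brute-force kNN sketch in plain Java (an illustration only, not the Flink ML API; BruteForceKnn and its methods are made-up names): for each query point it collects the k nearest training points, grouped per query point.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BruteForceKnn {
    // Euclidean distance between two feature vectors of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // For each query point, return its k nearest training points,
    // grouped per query point as discussed above.
    static List<double[][]> transform(double[][] training, double[][] queries, int k) {
        List<double[][]> result = new ArrayList<>();
        for (double[] q : queries) {
            List<double[]> candidates = new ArrayList<>(List.of(training));
            candidates.sort(Comparator.comparingDouble(t -> distance(q, t)));
            result.add(candidates.subList(0, Math.min(k, candidates.size()))
                    .toArray(new double[0][]));
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] training = {{0, 0}, {1, 0}, {5, 5}};
        double[][] queries = {{0.2, 0.1}};
        // The two nearest training points to (0.2, 0.1) are (0,0) and (1,0).
        double[][] neighbours = transform(training, queries, 2).get(0);
        System.out.println(neighbours.length); // prints "2"
    }
}
```

This is the O(n·m) exact formulation (n training points, m queries); the approximate variants discussed in this thread exist precisely to avoid this cost at scale.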

It's a great idea to give the user the choice of the algorithm that suits them 
best. It seems better to split this issue into two or more issues (exact kNN 
(BRJ), approximate kNN (zkNN, kNN based on hybrid spill trees), and distance 
measures).

I hope to implement the exact kNN and the distance measures first, and then 
try the approximate kNN.

 Add k-nearest-neighbours algorithm to machine learning library
 --

 Key: FLINK-1745
 URL: https://issues.apache.org/jira/browse/FLINK-1745
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Chiwan Park
  Labels: ML, Starter

 Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial 
 it is still used as a mean to classify data and to do regression.
 Could be a starter task.
 Resources:
 [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm]
 [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507675#comment-14507675
 ] 

ASF GitHub Bot commented on FLINK-1687:
---

Github user szape commented on the pull request:

https://github.com/apache/flink/pull/521#issuecomment-95303397
  
@gyfora 
Of course, no problem. It will make the source API cleaner.

Peter


 Streaming file source/sink API is not in sync with the batch API
 

 Key: FLINK-1687
 URL: https://issues.apache.org/jira/browse/FLINK-1687
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gábor Hermann
Assignee: Péter Szabó

 Streaming environment is missing file inputs like readFile, readCsvFile and 
 also the more general createInput function, and outputs like writeAsCsv and 
 write. Streaming and batch API should be consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-1653) Setting up Apache Jenkins CI for continuous tests

2015-04-22 Thread Henry Saputra (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Saputra closed FLINK-1653.

Resolution: Fixed

With the increased Travis capacity for ASF, let's close this one for now.

 Setting up Apache Jenkins CI for continuous tests
 -

 Key: FLINK-1653
 URL: https://issues.apache.org/jira/browse/FLINK-1653
 Project: Flink
  Issue Type: Task
  Components: Build System
Reporter: Henry Saputra
Assignee: Henry Saputra
Priority: Minor

 We already have Travis build for Apache Flink Github mirror.
 This task is used to track effort to setup Flink Jenkins CI in ASF 
 environment [1]
 [1] https://builds.apache.org



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library

2015-04-22 Thread Chiwan Park (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508420#comment-14508420
 ] 

Chiwan Park commented on FLINK-1745:


Okay, thanks [~raghav.chalapa...@gmail.com] for conceding. I will create an 
issue for the distance measure and implement both the distance measure and the 
exact kNN as soon as possible. :)

 Add k-nearest-neighbours algorithm to machine learning library
 --

 Key: FLINK-1745
 URL: https://issues.apache.org/jira/browse/FLINK-1745
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Chiwan Park
  Labels: ML, Starter

 Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial 
 it is still used as a mean to classify data and to do regression.
 Could be a starter task.
 Resources:
 [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm]
 [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Periodic full stream aggregates added + partit...

2015-04-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/614


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library

2015-04-22 Thread Raghav Chalapathy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508143#comment-14508143
 ] 

Raghav Chalapathy commented on FLINK-1745:
--

Hi 

I agree with the idea of providing users the choice. Chiwan, since you are 
hoping to take up the exact kNN implementation, shall I go ahead and work on 
other issues, wait for your exact implementation, and add approximate kNN 
later on?

I feel we should eventually provide the user with the results of all the best 
candidate algorithms and compare them, as H2O deep learning does, so that the 
user can see the performance of the various algorithms picked.

with regards
Raghav



 Add k-nearest-neighbours algorithm to machine learning library
 --

 Key: FLINK-1745
 URL: https://issues.apache.org/jira/browse/FLINK-1745
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Chiwan Park
  Labels: ML, Starter

 Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial 
 it is still used as a mean to classify data and to do regression.
 Could be a starter task.
 Resources:
 [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm]
 [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Add DEPRECATED annotations in Spargel APIs and...

2015-04-22 Thread hsaputra
GitHub user hsaputra opened a pull request:

https://github.com/apache/flink/pull/618

Add DEPRECATED annotations in Spargel APIs and update the doc to refer to 
Gelly for Flink graph processing.

Mark all user-facing classes from the Spargel API as deprecated, and add a 
comment in the docs pointing people to Gelly.

This should help migration to Gelly in the next release.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsaputra/flink 
FLINK-1693_deprecate_spargel_apis

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/618.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #618


commit 6e978948c86d58fa773af728fbd5e497b82b
Author: hsaputra hsapu...@apache.org
Date:   2015-04-22T21:10:29Z

Add DEPRECATED annotations in Spargel APIs and update the doc to refer to 
Gelly for Flink graph processing.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Add DEPRECATED annotations in Spargel APIs and...

2015-04-22 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/618#issuecomment-95337327
  
CC @vasia 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-1933) Add distance measure interface and basic implementation to machine learning library

2015-04-22 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-1933:
--

 Summary: Add distance measure interface and basic implementation 
to machine learning library
 Key: FLINK-1933
 URL: https://issues.apache.org/jira/browse/FLINK-1933
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Chiwan Park
Assignee: Chiwan Park


Add a distance measure interface to calculate the distance between two vectors, 
plus some implementations of the interface. In FLINK-1745, [~till.rohrmann] 
suggests the following interface:

{code}
trait DistanceMeasure {
  def distance(a: Vector, b: Vector): Double
}
{code}

I think the following list of implementations is sufficient to provide 
initially to ML library users.

* Manhattan distance [1]
* Cosine distance [2]
* Euclidean distance (and Squared) [3]
* Tanimoto distance [4]
* Minkowski distance [5]
* Chebyshev distance [6]

[1]: http://en.wikipedia.org/wiki/Taxicab_geometry
[2]: http://en.wikipedia.org/wiki/Cosine_similarity
[3]: http://en.wikipedia.org/wiki/Euclidean_distance
[4]: 
http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29
[5]: http://en.wikipedia.org/wiki/Minkowski_distance
[6]: http://en.wikipedia.org/wiki/Chebyshev_distance
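As a rough sketch of what implementations of the proposed interface could look like (a plain-Java analogue of the Scala trait above, using double[] for vectors; class names are made up, not the eventual Flink ML API):

```java
// Java analogue of the proposed Scala trait:
//   trait DistanceMeasure { def distance(a: Vector, b: Vector): Double }
interface DistanceMeasure {
    double distance(double[] a, double[] b);
}

// Euclidean distance: sqrt of the sum of squared component differences.
class EuclideanDistance implements DistanceMeasure {
    public double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}

// Manhattan (taxicab) distance: sum of absolute component differences.
class ManhattanDistance implements DistanceMeasure {
    public double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }
}

public class Main {
    public static void main(String[] args) {
        double[] a = {0, 0};
        double[] b = {3, 4};
        System.out.println(new EuclideanDistance().distance(a, b)); // prints "5.0"
        System.out.println(new ManhattanDistance().distance(a, b)); // prints "7.0"
    }
}
```

Algorithms such as kNN can then be written against the interface, so that any of the measures listed above can be plugged in.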



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...

2015-04-22 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/617#issuecomment-95162838
  
Are we sure that HBase is still working?

I think there are no automated tests for HBase ;)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-1875) Add figure to documentation describing slots and parallelism

2015-04-22 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1875.
---
Resolution: Fixed

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/db608332

 Add figure to documentation describing slots and parallelism
 

 Key: FLINK-1875
 URL: https://issues.apache.org/jira/browse/FLINK-1875
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger

 Our users are still confused how parallelism and slots are connected to each 
 other.
 We tried addressing this issue already with FLINK-1679, but I think we also 
 need to have a nice picture in our documentation.
 This is too complicated: 
 http://ci.apache.org/projects/flink/flink-docs-master/internal_job_scheduling.html





[jira] [Created] (FLINK-1923) Replace asynchronous logging in JobManager with regular slf4j logging

2015-04-22 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1923:
-

 Summary: Replace asynchronous logging in JobManager with regular 
slf4j logging
 Key: FLINK-1923
 URL: https://issues.apache.org/jira/browse/FLINK-1923
 Project: Flink
  Issue Type: Task
  Components: JobManager
Affects Versions: 0.9
Reporter: Robert Metzger


It's hard to understand exactly what's going on in the JobManager because the 
log messages are sent asynchronously by a logging actor.






[jira] [Created] (FLINK-1925) Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD

2015-04-22 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1925:


 Summary: Split SubmitTask method up into two phases: Receive TDD 
and instantiation of TDD
 Key: FLINK-1925
 URL: https://issues.apache.org/jira/browse/FLINK-1925
 Project: Flink
  Issue Type: Improvement
Reporter: Till Rohrmann
Assignee: Till Rohrmann


ResearchGate reported that a job times out while submitting tasks to the 
TaskManager. The reason is that the JobManager expects a TaskOperationResult 
response upon submitting a task to the TM. The TM then downloads the required 
jars from the JM, which blocks the actor thread and can take a very long time 
if many TMs download from the JM. Due to this, the SubmitTask future throws a 
TimeoutException.

A possible solution could be that the TM eagerly acknowledges the reception of 
the SubmitTask message and executes the task initialization within a future. 
Upon completion, the future sends an UpdateTaskExecutionState message to the 
JM, which switches the state of the task from deploying to running. This means 
that the handler of the SubmitTask future in {{Execution}} won't change the 
state of the task.
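The proposed two-phase handling could be sketched roughly as follows. All message and helper names here are illustrative stand-ins, not Flink's actual actor messages, and the TaskDeploymentDescriptor is modelled as a plain string:

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

// Illustrative stand-ins for the real messages and the TDD.
case class SubmitTask(taskId: String)
case class Acknowledge(taskId: String)
case class UpdateTaskExecutionState(taskId: String, state: String)

def handleSubmitTask(msg: SubmitTask, sendToJobManager: Any => Unit): Future[String] = {
  // Phase 1: acknowledge receipt immediately, so the JobManager's
  // SubmitTask future completes before any slow jar download starts.
  sendToJobManager(Acknowledge(msg.taskId))

  // Phase 2: run the blocking initialization (jar download, task setup)
  // off the actor thread and report the state transition when done.
  Future {
    // placeholder for the heavy work, e.g. downloading jars from the JM
    msg.taskId
  }.andThen {
    case Success(taskId) => sendToJobManager(UpdateTaskExecutionState(taskId, "RUNNING"))
    case Failure(_)      => sendToJobManager(UpdateTaskExecutionState(msg.taskId, "FAILED"))
  }
}
```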





[GitHub] flink pull request: [FLINK-1924] Minor Refactoring

2015-04-22 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/616

[FLINK-1924] Minor Refactoring

This PR resolves a few minor issues, including:
* formatting
* simpler python process initialization
* renaming of the python connection following the switch to tcp

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink python_refactor2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/616.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #616


commit 81d4c197bab6de4fb45f19ba2ca06f1f042c1812
Author: zentol s.mo...@web.de
Date:   2015-03-27T10:53:23Z

[FLINK-1924] Minor Refactoring






[jira] [Commented] (FLINK-1924) [Py] Refactor a few minor things

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506778#comment-14506778
 ] 

ASF GitHub Bot commented on FLINK-1924:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/616

[FLINK-1924] Minor Refactoring

This PR resolves a few minor issues, including:
* formatting
* simpler python process initialization
* renaming of the python connection following the switch to tcp

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink python_refactor2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/616.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #616


commit 81d4c197bab6de4fb45f19ba2ca06f1f042c1812
Author: zentol s.mo...@web.de
Date:   2015-03-27T10:53:23Z

[FLINK-1924] Minor Refactoring




 [Py] Refactor a few minor things
 

 Key: FLINK-1924
 URL: https://issues.apache.org/jira/browse/FLINK-1924
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Affects Versions: 0.8.1
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Trivial
 Fix For: 0.9








[jira] [Updated] (FLINK-1925) Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD

2015-04-22 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-1925:
-
Description: 
A user reported that a job times out while submitting tasks to the TaskManager. 
The reason is that the JobManager expects a TaskOperationResult response upon 
submitting a task to the TM. The TM then downloads the required jars from the 
JM, which blocks the actor thread and can take a very long time if many TMs 
download from the JM. Due to this, the SubmitTask future throws a 
TimeoutException.

A possible solution could be that the TM eagerly acknowledges the reception of 
the SubmitTask message and executes the task initialization within a future. 
Upon completion, the future sends an UpdateTaskExecutionState message to the 
JM, which switches the state of the task from deploying to running. This means 
that the handler of the SubmitTask future in {{Execution}} won't change the 
state of the task.

  was:
ResearchGate reported that a job times out while submitting tasks to the 
TaskManager. The reason is that the JobManager expects a TaskOperationResult 
response upon submitting a task to the TM. The TM downloads then the required 
jars from the JM which blocks the actor thread and can take a very long time if 
many TMs download from the JM. Due to this, the SubmitTask future throws a 
TimeOutException.

A possible solution could be that the TM eagerly acknowledges the reception of 
the SubmitTask message and executes the task initialization within a future. 
The future will upon completion send a UpdateTaskExecutionState message to the 
JM which switches the state of the task from deploying to running. This means 
that the handler of SubmitTask future in {{Execution}} won't change the state 
of the task.


 Split SubmitTask method up into two phases: Receive TDD and instantiation of 
 TDD
 

 Key: FLINK-1925
 URL: https://issues.apache.org/jira/browse/FLINK-1925
 Project: Flink
  Issue Type: Improvement
Reporter: Till Rohrmann
Assignee: Till Rohrmann

 A user reported that a job times out while submitting tasks to the 
 TaskManager. The reason is that the JobManager expects a TaskOperationResult 
 response upon submitting a task to the TM. The TM then downloads the required 
 jars from the JM, which blocks the actor thread and can take a very long time 
 if many TMs download from the JM. Due to this, the SubmitTask future throws a 
 TimeoutException.
 A possible solution could be that the TM eagerly acknowledges the reception 
 of the SubmitTask message and executes the task initialization within a 
 future. Upon completion, the future sends an UpdateTaskExecutionState 
 message to the JM, which switches the state of the task from deploying to 
 running. This means that the handler of the SubmitTask future in {{Execution}} 
 won't change the state of the task.





[jira] [Updated] (FLINK-1924) [Py] Refactor a few minor things

2015-04-22 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-1924:

Affects Version/s: (was: 0.8.2)
   0.9
Fix Version/s: (was: 0.8.2)
   0.9

 [Py] Refactor a few minor things
 

 Key: FLINK-1924
 URL: https://issues.apache.org/jira/browse/FLINK-1924
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Affects Versions: 0.9
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Trivial
 Fix For: 0.9








[jira] [Updated] (FLINK-1924) [Py] Refactor a few minor things

2015-04-22 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-1924:

Affects Version/s: (was: 0.8.1)
   0.8.2
Fix Version/s: (was: 0.9)
   0.8.2

 [Py] Refactor a few minor things
 

 Key: FLINK-1924
 URL: https://issues.apache.org/jira/browse/FLINK-1924
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Affects Versions: 0.8.2
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Trivial
 Fix For: 0.8.2








[jira] [Updated] (FLINK-1926) [Py] Sync Python API with other API's

2015-04-22 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-1926:

Summary: [Py] Sync Python API with other API's  (was: [Py] Sync Python API 
with otehr API's)

 [Py] Sync Python API with other API's
 -

 Key: FLINK-1926
 URL: https://issues.apache.org/jira/browse/FLINK-1926
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Affects Versions: 0.9
Reporter: Chesnay Schepler

 Meta issue to track overall sync status.





[jira] [Created] (FLINK-1926) [Py] Sync Python API with otehr API's

2015-04-22 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-1926:
---

 Summary: [Py] Sync Python API with otehr API's
 Key: FLINK-1926
 URL: https://issues.apache.org/jira/browse/FLINK-1926
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Affects Versions: 0.9
Reporter: Chesnay Schepler


Meta issue to track overall sync status.





[jira] [Created] (FLINK-1927) [Py] Rework operator distribution

2015-04-22 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-1927:
---

 Summary: [Py] Rework operator distribution
 Key: FLINK-1927
 URL: https://issues.apache.org/jira/browse/FLINK-1927
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Affects Versions: 0.9
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Minor
 Fix For: 0.9


Currently, the python operator is created when executing the python plan file, 
serialized using dill, and saved as a byte[] in the java function. It is then 
deserialized at runtime on each node.

The current implementation is fairly hacky, and imposes certain limitations 
that make it hard to work with. Chaining, or generally saving other user-code, 
always requires a separate deserialization step after deserializing the 
operator.

These issues can be easily circumvented by rebuilding the (python) plan on each 
node, instead of serializing the operator. The plan creation is deterministic, 
and every operator is uniquely identified by an ID that is already known to the 
java function.

This change will allow us to easily support custom serializers.
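The ID-based recovery described above could be sketched like this (all names are illustrative; the real plan consists of Python operators, not the toy case class used here):

```scala
// Toy stand-in for a plan operator, identified by an ID assigned
// deterministically at plan-creation time.
case class Operator(id: Int, name: String)

// Deterministic plan construction: every invocation, on any node,
// produces identical operators with identical IDs.
def buildPlan(): Seq[Operator] =
  Seq(Operator(0, "source"), Operator(1, "map"), Operator(2, "sink"))

// Instead of deserializing a shipped operator, each node rebuilds the
// plan and looks the operator up by the ID known to the java function.
def recoverOperator(id: Int): Operator =
  buildPlan().find(_.id == id)
    .getOrElse(throw new NoSuchElementException(s"no operator with id $id"))
```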





[jira] [Commented] (FLINK-1916) EOFException when running delta-iteration job

2015-04-22 Thread Stefan Bunk (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506732#comment-14506732
 ] 

Stefan Bunk commented on FLINK-1916:


Isn't the code I linked already quite minimal? What do you need?

 EOFException when running delta-iteration job
 -

 Key: FLINK-1916
 URL: https://issues.apache.org/jira/browse/FLINK-1916
 Project: Flink
  Issue Type: Bug
 Environment: 0.9-milestone-1
 Exception on the cluster, local execution works
Reporter: Stefan Bunk

 The delta-iteration program in [1] ends with an
 java.io.EOFException
   at 
 org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.nextSegment(InMemoryPartition.java:333)
   at 
 org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.advance(AbstractPagedOutputView.java:140)
   at 
 org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeByte(AbstractPagedOutputView.java:223)
   at 
 org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeLong(AbstractPagedOutputView.java:291)
   at 
 org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeDouble(AbstractPagedOutputView.java:307)
   at 
 org.apache.flink.api.common.typeutils.base.DoubleSerializer.serialize(DoubleSerializer.java:62)
   at 
 org.apache.flink.api.common.typeutils.base.DoubleSerializer.serialize(DoubleSerializer.java:26)
   at 
 org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:89)
   at 
 org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:29)
   at 
 org.apache.flink.runtime.operators.hash.InMemoryPartition.appendRecord(InMemoryPartition.java:219)
   at 
 org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:536)
   at 
 org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:347)
   at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:209)
   at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:270)
   at 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
   at java.lang.Thread.run(Thread.java:745)
 For logs and the accompanying mailing list discussion see below.
 When running with slightly different memory configuration, as hinted on the 
 mailing list, I sometimes also get this exception:
 19.Apr. 13:39:29 INFO  Task - IterationHead(WorksetIteration 
 (Resolved-Redirects)) (10/10) switched to FAILED : 
 java.lang.IndexOutOfBoundsException: Index: 161, Size: 161
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.resetTo(InMemoryPartition.java:352)
 at 
 org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.access$100(InMemoryPartition.java:301)
 at 
 org.apache.flink.runtime.operators.hash.InMemoryPartition.appendRecord(InMemoryPartition.java:226)
 at 
 org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:536)
 at 
 org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:347)
 at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:209)
 at 
 org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:270)
 at 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
 at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
 at java.lang.Thread.run(Thread.java:745)
 [1] Flink job: https://gist.github.com/knub/0bfec859a563009c1d57
 [2] Job manager logs: https://gist.github.com/knub/01e3a4b0edb8cde66ff4
 [3] One task manager's logs: https://gist.github.com/knub/8f2f953da95c8d7adefc
 [4] 
 http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/EOFException-when-running-Flink-job-td1092.html





[jira] [Created] (FLINK-1922) Failed task deployment causes NPE on input split assignment

2015-04-22 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1922:
-

 Summary: Failed task deployment causes NPE on input split 
assignment
 Key: FLINK-1922
 URL: https://issues.apache.org/jira/browse/FLINK-1922
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.9
Reporter: Robert Metzger


The input split assignment code is returning {{null}} if the task has failed, 
which causes an NPE.

We should improve our error handling / reporting in that situation.

{code}
13:12:31,002 INFO  org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1   
 - Status of job c0b47ce41e9a85a628a628a3977705ef (Flink Java Job at Tue Apr 21 
13:10:36 UTC 2015) changed to FAILING Cannot deploy task - TaskManager not 
responding..

13:12:47,591 ERROR org.apache.flink.runtime.operators.RegularPactTask   
 - Error in task code:  CHAIN DataSource (at userMethod 
(org.apache.flink.api.java.io.AvroInputFormat)) - FlatMap (FlatMap at 
main(UserClass.java:111)) (20/50)
java.lang.RuntimeException: Requesting the next InputSplit failed.
at 
org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:88)
at 
org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:337)
at 
org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:136)
at 
org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.NullPointerException
at java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:106)
at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:301)
at 
org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:83)
... 4 more
13:12:47,595 INFO  org.apache.flink.runtime.taskmanager.Task
 - CHAIN DataSource (at SomeMethod 
(org.apache.flink.api.java.io.AvroInputFormat)) - FlatMap (FlatMap at 
main(SomeClass.java:111)) (20/50) switched to FAILED : 
java.lang.RuntimeException: Requesting the next InputSplit failed.
at 
org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:88)
at 
org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:337)
at 
org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:136)
at 
org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.NullPointerException
at java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:106)
at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:301)
at 
org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:83)
... 4 more
{code}
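One direction for the improved error handling: fail fast with a descriptive exception instead of handing a null payload to the deserializer. This is a minimal sketch with an invented helper name and signature, not Flink's actual code:

```scala
// Hypothetical sketch: surface the failed-task condition explicitly
// rather than letting a null byte array reach ByteArrayInputStream.
def nextInputSplit(serializedSplit: Array[Byte], taskName: String): Array[Byte] = {
  if (serializedSplit == null)
    throw new IllegalStateException(
      s"No input split available for task '$taskName'; the task may already " +
      "have failed on the JobManager side.")
  serializedSplit
}
```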





[GitHub] flink pull request: [FLINK-1891]Add the input storageDirectory emp...

2015-04-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/601




[jira] [Commented] (FLINK-1891) Add isEmpty check when the input dir

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506825#comment-14506825
 ] 

ASF GitHub Bot commented on FLINK-1891:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/601


 Add isEmpty check when the input dir
 

 Key: FLINK-1891
 URL: https://issues.apache.org/jira/browse/FLINK-1891
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime
Affects Versions: master
Reporter: Sibao Hong
Assignee: Sibao Hong

 Add an empty check for the input storageDirectory: if the input 
 storageDirectory is empty, we should use tmp as the base dir.





[GitHub] flink pull request: [FLINK-703] Use complete element as join key

2015-04-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/572




[jira] [Commented] (FLINK-703) Use complete element as join key.

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506826#comment-14506826
 ] 

ASF GitHub Bot commented on FLINK-703:
--

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/572


 Use complete element as join key.
 -

 Key: FLINK-703
 URL: https://issues.apache.org/jira/browse/FLINK-703
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
Assignee: Chiwan Park
Priority: Trivial
  Labels: github-import
 Fix For: pre-apache


 In some situations such as semi-joins it could make sense to use a complete 
 element as join key. 
 Currently this can be done using a key-selector function, but we could offer 
 a shortcut for that.
 This is not an urgent issue, but might be helpful.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/703
 Created by: [fhueske|https://github.com/fhueske]
 Labels: enhancement, java api, user satisfaction, 
 Milestone: Release 0.6 (unplanned)
 Created at: Thu Apr 17 23:40:00 CEST 2014
 State: open





[jira] [Updated] (FLINK-703) Use complete element as join key.

2015-04-22 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-703:

Fix Version/s: (was: pre-apache)
   0.9

 Use complete element as join key.
 -

 Key: FLINK-703
 URL: https://issues.apache.org/jira/browse/FLINK-703
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
Assignee: Chiwan Park
Priority: Trivial
  Labels: github-import
 Fix For: 0.9


 In some situations such as semi-joins it could make sense to use a complete 
 element as join key. 
 Currently this can be done using a key-selector function, but we could offer 
 a shortcut for that.
 This is not an urgent issue, but might be helpful.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/703
 Created by: [fhueske|https://github.com/fhueske]
 Labels: enhancement, java api, user satisfaction, 
 Milestone: Release 0.6 (unplanned)
 Created at: Thu Apr 17 23:40:00 CEST 2014
 State: open





[jira] [Resolved] (FLINK-703) Use complete element as join key.

2015-04-22 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-703.
-
Resolution: Implemented

Implemented with 45e680c2b6c9c2f64ce55423b755a13d402ff8ba

 Use complete element as join key.
 -

 Key: FLINK-703
 URL: https://issues.apache.org/jira/browse/FLINK-703
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
Assignee: Chiwan Park
Priority: Trivial
  Labels: github-import
 Fix For: pre-apache


 In some situations such as semi-joins it could make sense to use a complete 
 element as join key. 
 Currently this can be done using a key-selector function, but we could offer 
 a shortcut for that.
 This is not an urgent issue, but might be helpful.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/703
 Created by: [fhueske|https://github.com/fhueske]
 Labels: enhancement, java api, user satisfaction, 
 Milestone: Release 0.6 (unplanned)
 Created at: Thu Apr 17 23:40:00 CEST 2014
 State: open





[jira] [Resolved] (FLINK-1891) Add isEmpty check when the input dir

2015-04-22 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-1891.
--
   Resolution: Fixed
Fix Version/s: 0.9

Fixed with a0147c493cf210a0914c35200ebfacd47515374d

 Add isEmpty check when the input dir
 

 Key: FLINK-1891
 URL: https://issues.apache.org/jira/browse/FLINK-1891
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime
Affects Versions: master
Reporter: Sibao Hong
Assignee: Sibao Hong
 Fix For: 0.9


 Add an empty check for the input storageDirectory: if the input 
 storageDirectory is empty, we should use tmp as the base dir.





[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

2015-04-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/372




[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506913#comment-14506913
 ] 

ASF GitHub Bot commented on FLINK-1486:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/372


 Add a string to the print method to identify output
 ---

 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
  Labels: usability
 Fix For: 0.9


 The output of the {{print}} method of {{DataSet}} is mainly used for debugging 
 purposes. Currently, it is difficult to identify the output.
 I would suggest adding another {{print(String str)}} method which allows the 
 user to supply a String to identify the output. This could be a prefix before 
 the actual output or a format string (which might be overkill).
 {code}
 DataSet data = env.fromElements(1,2,3,4,5);
 {code}
 For example, {{data.print("MyDataSet: ")}} would print
 {noformat}
 MyDataSet: 1
 MyDataSet: 2
 ...
 {noformat}
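Pending such a method, the proposed behaviour can be sketched without Flink by modelling the data set as a plain Seq ({{printWithPrefix}} is a hypothetical helper, not part of the API):

```scala
// Sketch of the proposed print(String) behaviour: prepend the supplied
// prefix to each element before printing it.
def printWithPrefix[T](data: Seq[T], prefix: String): Seq[String] = {
  val lines = data.map(e => prefix + e)
  lines.foreach(println)
  lines // returned so callers can inspect what was printed
}
```

For instance, {{printWithPrefix(Seq(1, 2, 3), "MyDataSet: ")}} prints the three lines shown in the {{noformat}} example above.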





[jira] [Commented] (FLINK-1804) flink-quickstart-scala tests fail on scala-2.11 build profile on travis

2015-04-22 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506691#comment-14506691
 ] 

Robert Metzger commented on FLINK-1804:
---

I didn't see them lately. I'll close the issue for now.

 flink-quickstart-scala tests fail on scala-2.11 build profile on travis
 ---

 Key: FLINK-1804
 URL: https://issues.apache.org/jira/browse/FLINK-1804
 Project: Flink
  Issue Type: Task
  Components: Build System, Quickstarts
Affects Versions: 0.9
Reporter: Robert Metzger

 Travis builds on master started failing after the Scala 2.11 profile has been 
 added to Flink.
 For example: https://travis-ci.org/apache/flink/jobs/56312734
 The error:
 {code}
 [INFO] [INFO] --- scala-maven-plugin:3.1.4:compile (default) @ testArtifact 
 ---
 [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype
 [INFO] [INFO] artifact joda-time:joda-time: checking for updates from 
 sonatype-apache
 [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype
 [INFO] [INFO] artifact joda-time:joda-time: checking for updates from 
 sonatype-apache
 [INFO] [WARNING]  Expected all dependencies to require Scala version: 2.10.4
 [INFO] [WARNING]  com.twitter:chill_2.10:0.5.2 requires scala version: 2.10.4
 [INFO] [WARNING]  com.twitter:chill-avro_2.10:0.5.2 requires scala version: 
 2.10.4
 [INFO] [WARNING]  com.twitter:chill-bijection_2.10:0.5.2 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  com.twitter:bijection-core_2.10:0.7.2 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  com.twitter:bijection-avro_2.10:0.7.2 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.scala-lang:scala-reflect:2.10.4 requires scala version: 
 2.10.4
 [INFO] [WARNING]  org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.scala-lang:scala-compiler:2.10.4 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.scalamacros:quasiquotes_2.10:2.0.1 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.apache.flink:flink-streaming-scala:0.9-SNAPSHOT 
 requires scala version: 2.11.4
 [INFO] [WARNING] Multiple versions of scala libraries detected!
 [INFO] [INFO] 
 /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala:-1:
  info: compiling
 [INFO] [INFO] Compiling 3 source files to 
 /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes
  at 1427650524446
 [INFO] [ERROR] error: 
 [INFO] [INFO]  while compiling: 
 /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala/org/apache/flink/archetypetest/SocketTextStreamWordCount.scala
 [INFO] [INFO] during phase: typer
 [INFO] [INFO]  library version: version 2.10.4
 [INFO] [INFO] compiler version: version 2.10.4
 [INFO] [INFO]   reconstructed args: -d 
 /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes
  -classpath 
 

[jira] [Resolved] (FLINK-1804) flink-quickstart-scala tests fail on scala-2.11 build profile on travis

2015-04-22 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1804.
---
Resolution: Invalid

 flink-quickstart-scala tests fail on scala-2.11 build profile on travis
 ---

 Key: FLINK-1804
 URL: https://issues.apache.org/jira/browse/FLINK-1804
 Project: Flink
  Issue Type: Task
  Components: Build System, Quickstarts
Affects Versions: 0.9
Reporter: Robert Metzger

 Travis builds on master started failing after the Scala 2.11 profile was 
 added to Flink.
 For example: https://travis-ci.org/apache/flink/jobs/56312734
 The error:
 {code}
 [INFO] [INFO] --- scala-maven-plugin:3.1.4:compile (default) @ testArtifact 
 ---
 [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype
 [INFO] [INFO] artifact joda-time:joda-time: checking for updates from 
 sonatype-apache
 [INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype
 [INFO] [INFO] artifact joda-time:joda-time: checking for updates from 
 sonatype-apache
 [INFO] [WARNING]  Expected all dependencies to require Scala version: 2.10.4
 [INFO] [WARNING]  com.twitter:chill_2.10:0.5.2 requires scala version: 2.10.4
 [INFO] [WARNING]  com.twitter:chill-avro_2.10:0.5.2 requires scala version: 
 2.10.4
 [INFO] [WARNING]  com.twitter:chill-bijection_2.10:0.5.2 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  com.twitter:bijection-core_2.10:0.7.2 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  com.twitter:bijection-avro_2.10:0.7.2 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.scala-lang:scala-reflect:2.10.4 requires scala version: 
 2.10.4
 [INFO] [WARNING]  org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.scala-lang:scala-compiler:2.10.4 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.scalamacros:quasiquotes_2.10:2.0.1 requires scala 
 version: 2.10.4
 [INFO] [WARNING]  org.apache.flink:flink-streaming-scala:0.9-SNAPSHOT 
 requires scala version: 2.11.4
 [INFO] [WARNING] Multiple versions of scala libraries detected!
 [INFO] [INFO] 
 /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala:-1:
  info: compiling
 [INFO] [INFO] Compiling 3 source files to 
 /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes
  at 1427650524446
 [INFO] [ERROR] error: 
 [INFO] [INFO]  while compiling: 
 /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala/org/apache/flink/archetypetest/SocketTextStreamWordCount.scala
 [INFO] [INFO] during phase: typer
 [INFO] [INFO]  library version: version 2.10.4
 [INFO] [INFO] compiler version: version 2.10.4
 [INFO] [INFO]   reconstructed args: -d 
 /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes
  -classpath 
 

[jira] [Created] (FLINK-1920) Passing -D akka.ask.timeout=5 min to yarn client does not work

2015-04-22 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1920:
-

 Summary: Passing -D akka.ask.timeout=5 min to yarn client does not 
work
 Key: FLINK-1920
 URL: https://issues.apache.org/jira/browse/FLINK-1920
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.9
Reporter: Robert Metzger


That's probably an issue with the command-line parsing.

Variations of {{-D akka.ask.timeout=5 min}} are also not working.
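The tokenization problem can be illustrated standalone (a hypothetical sketch, not the actual YARN client parser): the shell splits the value on whitespace before the client ever sees it, so a naive {{-D}} handler only consumes the token immediately after {{-D}} and silently drops the rest.

```java
import java.util.HashMap;
import java.util.Map;

public class DynamicPropertyDemo {
    // Naive -D parser: consumes only the single token following -D.
    static Map<String, String> parse(String[] args) {
        Map<String, String> props = new HashMap<>();
        for (int i = 0; i < args.length - 1; i++) {
            if ("-D".equals(args[i])) {
                String[] kv = args[i + 1].split("=", 2);
                props.put(kv[0], kv.length > 1 ? kv[1] : "");
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // The shell turns "-D akka.ask.timeout=5 min" into three tokens;
        // the trailing "min" never reaches the property value.
        String[] argv = {"-D", "akka.ask.timeout=5", "min"};
        System.out.println(parse(argv).get("akka.ask.timeout")); // prints "5"
    }
}
```

A fix would need to either require quoting at the shell level or rejoin trailing tokens that do not look like new options.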



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1921) Rework parallelism/slots handling for per-job YARN sessions

2015-04-22 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1921:
-

 Summary: Rework parallelism/slots handling for per-job YARN 
sessions
 Key: FLINK-1921
 URL: https://issues.apache.org/jira/browse/FLINK-1921
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Robert Metzger


Right now, the -p argument is overwriting the -ys argument for per job yarn 
sessions.


Also, the priorities for parallelism should be documented:
low to high
1. flink-conf.yaml (-D arguments on YARN)
2. -p on ./bin/flink
3. ExecutionEnvironment.setParallelism()
4. Operator.setParallelism().
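The intended low-to-high precedence can be modeled as picking the most specific explicit setting (a minimal sketch of the documented ordering, not Flink's actual resolution code; -1 stands in for "not set"):

```java
public class ParallelismResolution {
    // Arguments ordered from lowest to highest priority.
    static int resolve(int configYaml, int clientP, int envP, int operatorP) {
        // The highest-priority explicit setting wins; otherwise fall back.
        if (operatorP > 0) return operatorP;
        if (envP > 0)      return envP;
        if (clientP > 0)   return clientP;
        return configYaml;
    }

    public static void main(String[] args) {
        // flink-conf.yaml says 2, ./bin/flink -p 8, env/operator unset:
        System.out.println(resolve(2, 8, -1, -1)); // prints 8
        // Operator.setParallelism(4) overrides everything else:
        System.out.println(resolve(2, 8, 16, 4));  // prints 4
    }
}
```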





[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

2015-04-22 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/372#issuecomment-95153627
  
I think you are right. If there's only one sink active, there is no need 
for a sink identifier.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506968#comment-14506968
 ] 

ASF GitHub Bot commented on FLINK-1486:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/372#issuecomment-95153627
  
I think you are right. If there's only one sink active, there is no need 
for a sink identifier.


 Add a string to the print method to identify output
 ---

 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
  Labels: usability
 Fix For: 0.9


 The output of the {{print}} method of {{DataSet}} is mainly used for 
 debugging purposes. Currently, it is difficult to identify the output.
 I would suggest adding another {{print(String str)}} method that allows the 
 user to supply a String to identify the output. This could be a prefix before 
 the actual output or a format string (which might be overkill).
 {code}
 DataSet data = env.fromElements(1,2,3,4,5);
 {code}
 For example, {{data.print("MyDataSet: ")}} would print
 {noformat}
 MyDataSet: 1
 MyDataSet: 2
 ...
 {noformat}
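The proposed prefixing boils down to simple per-element formatting; as a minimal standalone sketch (the {{format}} helper below is illustrative, not Flink's actual sink implementation):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PrefixedPrintDemo {
    // Formats each element the way a print(String) sink might emit it.
    static List<String> format(String prefix, List<Integer> data) {
        return data.stream()
                   .map(x -> prefix + x)
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
        for (String line : format("MyDataSet: ", data)) {
            System.out.println(line); // MyDataSet: 1, MyDataSet: 2, ...
        }
    }
}
```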





[jira] [Commented] (FLINK-1878) Add mode to Environments to deactivate sysout printing

2015-04-22 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506978#comment-14506978
 ] 

Stephan Ewen commented on FLINK-1878:
-

Implemented in b70431239a5e18555866addb41ee6edf2b79ff60

 Add mode to Environments to deactivate sysout printing
 --

 Key: FLINK-1878
 URL: https://issues.apache.org/jira/browse/FLINK-1878
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.9


 The test output is currently spoiled for debugging by all the sysout output 
 from the RemoteEnvironment-based tests.
 The execution environment should offer a mode to activate/deactivate printing 
 to {{System.out}}; for tests, we should deactivate it.
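Such a toggle amounts to a flag checked before each status line is printed; a hypothetical sketch (the names below are illustrative, not Flink's actual API):

```java
public class SysoutToggleDemo {
    // Hypothetical config flag; on by default, as for interactive use.
    private boolean printProgress = true;

    void disableSysoutLogging() {
        printProgress = false; // tests would flip this off
    }

    // Emits the status to System.out only when the flag is on;
    // returns what was printed so the behavior is observable.
    String report(String status) {
        if (printProgress) {
            System.out.println(status);
            return status;
        }
        return "";
    }

    public static void main(String[] args) {
        SysoutToggleDemo env = new SysoutToggleDemo();
        env.report("Job execution switched to status RUNNING"); // printed
        env.disableSysoutLogging();
        env.report("suppressed"); // nothing printed now
    }
}
```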





[jira] [Resolved] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment

2015-04-22 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1918.
-
   Resolution: Fixed
Fix Version/s: 0.9
 Assignee: Stephan Ewen

Fixed via 2b8db40ac40d70027ce331f3a04c6ca7aa562a84

 NullPointerException at org.apache.flink.client.program.Client's constructor 
 while using ExecutionEnvironment.createRemoteEnvironment
 -

 Key: FLINK-1918
 URL: https://issues.apache.org/jira/browse/FLINK-1918
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Zoltán Zvara
Assignee: Stephan Ewen
  Labels: yarn, yarn-client
 Fix For: 0.9


 Trace:
 {code}
  Exception in thread "main" java.lang.NullPointerException
   at org.apache.flink.client.program.Client.init(Client.java:104)
   at 
 org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:86)
   at 
 org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:82)
   at 
 org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:70)
   at Wordcount.main(Wordcount.java:23)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
 {code}
 The constructor is trying to set configuration parameter 
 {{jobmanager.rpc.address}} with 
 {{jobManagerAddress.getAddress().getHostAddress()}}, but 
 {{jobManagerAddress.holder.addr}} is {{null}}. 
 {{jobManagerAddress.holder.hostname}} and {{port}} hold the valid 
 information.
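The underlying JDK behavior can be reproduced standalone: for an {{InetSocketAddress}} that has not been resolved, {{getAddress()}} returns {{null}} (so dereferencing it throws the NPE), while {{getHostString()}} and {{getPort()}} still carry the valid information. The hostname below is just a placeholder:

```java
import java.net.InetSocketAddress;

public class UnresolvedAddressDemo {
    public static void main(String[] args) {
        // An address created without any DNS resolution attempt:
        InetSocketAddress addr =
            InetSocketAddress.createUnresolved("jobmanager-host", 6123);

        System.out.println(addr.getAddress());    // null -> NPE if dereferenced
        System.out.println(addr.getHostString()); // jobmanager-host
        System.out.println(addr.getPort());       // 6123
    }
}
```

Using {{getHostString()}} instead of {{getAddress().getHostAddress()}} avoids the null dereference for unresolved addresses.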





[jira] [Closed] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment

2015-04-22 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-1918.
---

 NullPointerException at org.apache.flink.client.program.Client's constructor 
 while using ExecutionEnvironment.createRemoteEnvironment
 -

 Key: FLINK-1918
 URL: https://issues.apache.org/jira/browse/FLINK-1918
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Zoltán Zvara
Assignee: Stephan Ewen
  Labels: yarn, yarn-client
 Fix For: 0.9


 Trace:
 {code}
  Exception in thread "main" java.lang.NullPointerException
   at org.apache.flink.client.program.Client.init(Client.java:104)
   at 
 org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:86)
   at 
 org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:82)
   at 
 org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:70)
   at Wordcount.main(Wordcount.java:23)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
 {code}
 The constructor is trying to set configuration parameter 
 {{jobmanager.rpc.address}} with 
 {{jobManagerAddress.getAddress().getHostAddress()}}, but 
 {{jobManagerAddress.holder.addr}} is {{null}}. 
 {{jobManagerAddress.holder.hostname}} and {{port}} hold the valid 
 information.





[GitHub] flink pull request: Periodic full stream aggregates added + partit...

2015-04-22 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/614#issuecomment-95156797
  
LGTM, really nice feature. Maybe we should also discourage or even disable 
the current full stream reduces, where the state might grow indefinitely, now 
that we have this option.




[jira] [Commented] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment

2015-04-22 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506983#comment-14506983
 ] 

Stephan Ewen commented on FLINK-1918:
-

[~Ehnalis] It should be fixed in the latest master.

You can compile your own, or wait a few hours until Travis/Apache have synced 
the maven repositories with the new version.

 NullPointerException at org.apache.flink.client.program.Client's constructor 
 while using ExecutionEnvironment.createRemoteEnvironment
 -

 Key: FLINK-1918
 URL: https://issues.apache.org/jira/browse/FLINK-1918
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Zoltán Zvara
Assignee: Stephan Ewen
  Labels: yarn, yarn-client
 Fix For: 0.9


 Trace:
 {code}
  Exception in thread "main" java.lang.NullPointerException
   at org.apache.flink.client.program.Client.init(Client.java:104)
   at 
 org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:86)
   at 
 org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:82)
   at 
 org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:70)
   at Wordcount.main(Wordcount.java:23)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
 {code}
 The constructor is trying to set configuration parameter 
 {{jobmanager.rpc.address}} with 
 {{jobManagerAddress.getAddress().getHostAddress()}}, but 
 {{jobManagerAddress.holder.addr}} is {{null}}. 
 {{jobManagerAddress.holder.hostname}} and {{port}} hold the valid 
 information.





[GitHub] flink pull request: [FLINK-1909] [streaming] Type handling refacto...

2015-04-22 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/610#issuecomment-95157189
  
LGTM.




[jira] [Commented] (FLINK-1909) Refactor streaming scala api to use returns for adding typeinfo

2015-04-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506986#comment-14506986
 ] 

ASF GitHub Bot commented on FLINK-1909:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/610#issuecomment-95157189
  
LGTM.


 Refactor streaming scala api to use returns for adding typeinfo
 ---

 Key: FLINK-1909
 URL: https://issues.apache.org/jira/browse/FLINK-1909
 Project: Flink
  Issue Type: Improvement
  Components: Scala API, Streaming
Reporter: Gyula Fora
Assignee: Gyula Fora

 Currently the streaming scala api uses transform to pass the extracted type 
 information instead of .returns. This leads to a lot of code duplication.
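The duplication-avoiding pattern can be modeled in plain Java (a toy illustration of the fluent {{returns}} idea, not Flink's actual operator classes): type information is attached once, after construction, instead of being threaded through every {{transform}} call site.

```java
public class ReturnsDemo {
    // A toy operator that carries an optional type hint, mimicking the
    // returns(...) pattern.
    static class Operator<T> {
        String typeHint = "unknown";

        Operator<T> returns(String typeHint) {
            this.typeHint = typeHint; // attach type info after construction
            return this;              // fluent: keeps the call chain going
        }
    }

    public static void main(String[] args) {
        // Type info is added once at the end of the chain.
        Operator<Long> op = new Operator<Long>().returns("Long");
        System.out.println(op.typeHint); // prints "Long"
    }
}
```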


