[jira] [Commented] (FLINK-1682) Port Record-API based optimizer tests to new Java API
[ https://issues.apache.org/jira/browse/FLINK-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513714#comment-14513714 ] ASF GitHub Bot commented on FLINK-1682: --- GitHub user fhueske opened a pull request: https://github.com/apache/flink/pull/627 [FLINK-1682] Ported optimizer unit tests from Record API to Java API This is a step towards removing the deprecated Record API. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fhueske/flink recordOptimizerTests Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/627.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #627 commit 303e86cc7af2604c3d9c781bd08a656daf6b99ae Author: Fabian Hueske fhue...@apache.org Date: 2015-04-24T23:30:11Z [FLINK-1682] Ported optimizer unit tests from Record API to Java API Port Record-API based optimizer tests to new Java API - Key: FLINK-1682 URL: https://issues.apache.org/jira/browse/FLINK-1682 Project: Flink Issue Type: Sub-task Components: Optimizer Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1789) Allow adding of URLs to the usercode class loader
[ https://issues.apache.org/jira/browse/FLINK-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513870#comment-14513870 ] ASF GitHub Bot commented on FLINK-1789: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-96599424 Yes, user-facing remains strings which are file paths. Allow adding of URLs to the usercode class loader - Key: FLINK-1789 URL: https://issues.apache.org/jira/browse/FLINK-1789 Project: Flink Issue Type: Improvement Components: Distributed Runtime Reporter: Timo Walther Assignee: Timo Walther Priority: Minor Currently, there is no option to add custom classpath URLs to the FlinkUserCodeClassLoader. JARs always need to be shipped to the cluster even if they are already present on all nodes. It would be great if RemoteEnvironment also accepted valid classpath URLs and forwarded them to BlobLibraryCacheManager.
[GitHub] flink pull request: [FLINK-1789] [core] [runtime] [java-api] Allow...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-96599424 Yes, user-facing remains strings which are file paths. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (FLINK-1947) Make Avro and Tachyon test logging less verbose
Till Rohrmann created FLINK-1947: Summary: Make Avro and Tachyon test logging less verbose Key: FLINK-1947 URL: https://issues.apache.org/jira/browse/FLINK-1947 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Priority: Minor Currently, the {{AvroExternalJarProgramITCase}} and the Tachyon test cases write the cluster status messages to stdout. I think these messages are not needed and only clutter the test output. Therefore, we should maybe suppress these messages.
[GitHub] flink pull request: [FLINK-1938] Add Grunt for building the front-...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/623#issuecomment-96562609 You are definitely welcome to participate in the new web frontend development. Have a look at the branch I sent you and see if you find your way around the new code (it is not too much so far, so it is okay, hopefully). I am not familiar with Grunt. It says it is a JavaScript automation framework. What is it for, in the context of the web frontend?
[GitHub] flink pull request: [FLINK-1923] Replaces asynchronous logging wit...
GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/628 [FLINK-1923] Replaces asynchronous logging with synchronous logging in actors Replaces asynchronous logging with synchronous logging in actors. Additionally, all Scala implementations are now using the grizzled-slf4j SLF4J wrapper, which allows proper usage of SLF4J within Scala code. One problem the grizzled-slf4j wrapper fixes is the ambiguity between varargs and a string with two placeholders. Additionally, it resolves the ambiguity between the (String, String, Object) overloads where the first String is used as a Marker versus as the logging message. Grizzled-slf4j automatically adds logging guards, which makes explicit log-level checks redundant. Grizzled-slf4j's license is BSD. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tillrohrmann/flink fixLogging Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/628.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #628 commit 34c9fd54448b340505c97679299b8876f0257828 Author: Till Rohrmann trohrm...@apache.org Date: 2015-04-24T14:33:34Z [FLINK-1923] [runtime] Replaces asynchronous logging with synchronous logging using grizzled-slf4j wrapper for Scala.
[jira] [Commented] (FLINK-1923) Replace asynchronous logging in JobManager with regular slf4j logging
[ https://issues.apache.org/jira/browse/FLINK-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513746#comment-14513746 ] ASF GitHub Bot commented on FLINK-1923: --- GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/628 [FLINK-1923] Replaces asynchronous logging with synchronous logging in actors Replaces asynchronous logging with synchronous logging in actors. Additionally, all Scala implementations are now using the grizzled-slf4j SLF4J wrapper, which allows proper usage of SLF4J within Scala code. One problem the grizzled-slf4j wrapper fixes is the ambiguity between varargs and a string with two placeholders. Additionally, it resolves the ambiguity between the (String, String, Object) overloads where the first String is used as a Marker versus as the logging message. Grizzled-slf4j automatically adds logging guards, which makes explicit log-level checks redundant. Grizzled-slf4j's license is BSD. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tillrohrmann/flink fixLogging Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/628.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #628 commit 34c9fd54448b340505c97679299b8876f0257828 Author: Till Rohrmann trohrm...@apache.org Date: 2015-04-24T14:33:34Z [FLINK-1923] [runtime] Replaces asynchronous logging with synchronous logging using grizzled-slf4j wrapper for Scala.
Replace asynchronous logging in JobManager with regular slf4j logging - Key: FLINK-1923 URL: https://issues.apache.org/jira/browse/FLINK-1923 Project: Flink Issue Type: Task Components: JobManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Till Rohrmann It's hard to understand exactly what's going on in the JobManager because the log messages are sent asynchronously by a logging actor.
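The logging-guard point in the pull request above can be illustrated in plain Java. This is a hedged sketch of the general pattern, not Flink or grizzled-slf4j code; `GuardedLogger` and its methods are hypothetical names. A `Supplier`-based `debug` defers building the message until the level check passes, which is what grizzled-slf4j achieves automatically with Scala by-name parameters.

```java
import java.util.function.Supplier;

// Hypothetical logger illustrating the logging-guard pattern discussed above.
class GuardedLogger {
    private final boolean debugEnabled;
    int messagesBuilt = 0; // counts how many message strings were constructed

    GuardedLogger(boolean debugEnabled) {
        this.debugEnabled = debugEnabled;
    }

    // Explicit guard style: the caller must check the level before building
    // the message, or pay the construction cost on every call.
    void debugIfEnabled(String message) {
        if (debugEnabled) {
            write(message);
        }
    }

    // Automatic guard style: the message is only built if the level check
    // passes, mirroring what grizzled-slf4j generates via by-name parameters.
    void debug(Supplier<String> message) {
        if (debugEnabled) {
            write(message.get());
        }
    }

    String expensiveDump() {
        messagesBuilt++; // simulates a costly state dump / toString
        return "state dump";
    }

    private void write(String message) {
        // a real logger would emit the message here
    }
}
```

With debug disabled, `logger.debug(logger::expensiveDump)` never constructs the string, so no explicit `isDebugEnabled()`-style check is needed at the call site.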
[GitHub] flink pull request: [FLINK-1924] Minor Refactoring
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/616#issuecomment-96573887 +1, can you merge it @mxm
[jira] [Commented] (FLINK-1682) Port Record-API based optimizer tests to new Java API
[ https://issues.apache.org/jira/browse/FLINK-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513773#comment-14513773 ] ASF GitHub Bot commented on FLINK-1682: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/627#issuecomment-96580250 Wow, very nice :-) Looks correct from a first glance (I did not thoroughly check everything). One remark: Since `print()` may become an eagerly evaluated command in the future, the tests become more robust when using `.output(new DiscardingOutputFormat())` as the sink. Port Record-API based optimizer tests to new Java API - Key: FLINK-1682 URL: https://issues.apache.org/jira/browse/FLINK-1682 Project: Flink Issue Type: Sub-task Components: Optimizer Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor Fix For: 0.9
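Stephan's remark about preferring `.output(new DiscardingOutputFormat())` over `print()` can be sketched with a plain-Java stand-in (hypothetical names, not Flink's actual classes): a discarding sink terminates the plan by consuming every record without producing any output, so the test cannot be affected if `print()` ever becomes eagerly evaluated.

```java
// Minimal stand-in for the idea behind Flink's DiscardingOutputFormat:
// accept every record and drop it. All names here are illustrative only.
interface SimpleOutputFormat<T> {
    void writeRecord(T record);
}

class DiscardingSink<T> implements SimpleOutputFormat<T> {
    long recordsSeen = 0; // tracked only so the behavior is observable

    @Override
    public void writeRecord(T record) {
        recordsSeen++; // the record itself is discarded; nothing is printed
    }
}
```

A sink like this gives an optimizer test a well-defined plan terminator with no side effects on stdout.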
[GitHub] flink pull request: [FLINK-1789] [core] [runtime] [java-api] Allow...
Github user twalthr commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-96591270 It is not a problem to expose this to the client as well, I was just unsure if it is necessary. I will add it to the client as well to be in sync. Yes, it makes sense to use URLs instead of URLs and Files. But the user-facing APIs remain Strings, right?
[jira] [Commented] (FLINK-1947) Make Avro and Tachyon test logging less verbose
[ https://issues.apache.org/jira/browse/FLINK-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513768#comment-14513768 ] Stephan Ewen commented on FLINK-1947: - You can avoid the sysout logging by using {{ExecutionEnvironment.getConfig().disableSysoutLogging()}}. Make Avro and Tachyon test logging less verbose --- Key: FLINK-1947 URL: https://issues.apache.org/jira/browse/FLINK-1947 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Priority: Minor Currently, the {{AvroExternalJarProgramITCase}} and the Tachyon test cases write the cluster status messages to stdout. I think these messages are not needed and only clutter the test output. Therefore, we should maybe suppress these messages.
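A minimal stand-in for the flag Stephan's suggestion toggles (hypothetical class; only the method name mirrors the real `disableSysoutLogging()` on Flink's execution config):

```java
// Hedged sketch, not Flink's actual ExecutionConfig: the flag controls
// whether cluster status messages are echoed to stdout during execution.
class ExecutionConfigSketch {
    private boolean sysoutLogging = true; // status goes to stdout by default

    // Fluent, so calls can be chained off a getConfig()-style accessor.
    ExecutionConfigSketch disableSysoutLogging() {
        this.sysoutLogging = false;
        return this;
    }

    boolean isSysoutLoggingEnabled() {
        return sysoutLogging;
    }
}
```

Disabling the flag in a test's environment setup keeps cluster status messages out of the test output.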
[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/617
[GitHub] flink pull request: [FLINK-1682] Ported optimizer unit tests from ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/627#issuecomment-96580250 Wow, very nice :-) Looks correct from a first glance (I did not thoroughly check everything). One remark: Since `print()` may become an eagerly evaluated command in the future, the tests become more robust when using `.output(new DiscardingOutputFormat())` as the sink.
[jira] [Commented] (FLINK-1789) Allow adding of URLs to the usercode class loader
[ https://issues.apache.org/jira/browse/FLINK-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513849#comment-14513849 ] ASF GitHub Bot commented on FLINK-1789: --- Github user twalthr commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-96591270 It is not a problem to expose this to the client as well, I was just unsure if it is necessary. I will add it to the client as well to be in sync. Yes, it makes sense to use URLs instead of URLs and Files. But the user-facing APIs remain Strings, right? Allow adding of URLs to the usercode class loader - Key: FLINK-1789 URL: https://issues.apache.org/jira/browse/FLINK-1789 Project: Flink Issue Type: Improvement Components: Distributed Runtime Reporter: Timo Walther Assignee: Timo Walther Priority: Minor Currently, there is no option to add custom classpath URLs to the FlinkUserCodeClassLoader. JARs always need to be shipped to the cluster even if they are already present on all nodes. It would be great if RemoteEnvironment also accepted valid classpath URLs and forwarded them to BlobLibraryCacheManager.
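The conclusion of the thread above — user-facing APIs keep classpath strings, which are converted to URLs for the usercode class loader — can be sketched with plain JDK classes. The helper below and the jar path in the usage note are hypothetical; `java.net.URLClassLoader` is the real JDK class that Flink's FlinkUserCodeClassLoader builds on.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

// Hypothetical helper showing the String -> URL -> class loader chain
// the pull request discusses.
class UserCodeClassLoaders {
    static URLClassLoader fromClasspathStrings(ClassLoader parent, String... paths) {
        URL[] urls = new URL[paths.length];
        for (int i = 0; i < paths.length; i++) {
            try {
                // Convert each file-path string to a URL; malformed paths
                // fail here, before anything is shipped to the cluster.
                urls[i] = Paths.get(paths[i]).toUri().toURL();
            } catch (java.net.MalformedURLException e) {
                throw new IllegalArgumentException("Invalid classpath entry: " + paths[i], e);
            }
        }
        // Parent delegation still resolves framework and JDK classes.
        return new URLClassLoader(urls, parent);
    }
}
```

For example, `fromClasspathStrings(parent, "/opt/libs/udf.jar")` (a made-up path) yields a loader that can resolve classes from a jar already present on every node, without shipping it.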
[jira] [Commented] (FLINK-1941) Add documentation for Gelly-GSA
[ https://issues.apache.org/jira/browse/FLINK-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514777#comment-14514777 ] Vasia Kalavri commented on FLINK-1941: -- :-) you're way too fast for my current reviewing capacity! I should be able to work on this one by the end of this week or the beginning of next. Add documentation for Gelly-GSA --- Key: FLINK-1941 URL: https://issues.apache.org/jira/browse/FLINK-1941 Project: Flink Issue Type: Task Components: Gelly Affects Versions: 0.9 Reporter: Vasia Kalavri Labels: docs, gelly Add a section in the Gelly guide to describe the newly introduced Gather-Sum-Apply iteration method. Show how GSA uses delta iterations internally and explain how this model differs from the vertex-centric one.
[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat
Github user Elbehery commented on the pull request: https://github.com/apache/flink/pull/621#issuecomment-96809741 @aljoscha I have run the code formatter from IntelliJ, so the problem should be solved now. I have tried to read the Travis build log, but I could not find the cause of the problem. Would you please tell me how to find it, to save time in the future?
[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets
[ https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514915#comment-14514915 ] ASF GitHub Bot commented on FLINK-1615: --- Github user Elbehery commented on the pull request: https://github.com/apache/flink/pull/621#issuecomment-96809741 @aljoscha I have run the code formatter from IntelliJ, so the problem should be solved now. I have tried to read the Travis build log, but I could not find the cause of the problem. Would you please tell me how to find it, to save time in the future? Introduces a new InputFormat for Tweets --- Key: FLINK-1615 URL: https://issues.apache.org/jira/browse/FLINK-1615 Project: Flink Issue Type: New Feature Components: flink-contrib Affects Versions: 0.8.1 Reporter: mustafa elbehery Priority: Minor An event-driven parser for Tweets into Java Pojos. It parses all the important parts of a tweet into Java objects. Tested on a cluster, and the performance is pretty good.
[jira] [Commented] (FLINK-1828) Impossible to output data to an HBase table
[ https://issues.apache.org/jira/browse/FLINK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515048#comment-14515048 ] ASF GitHub Bot commented on FLINK-1828: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/571 Impossible to output data to an HBase table --- Key: FLINK-1828 URL: https://issues.apache.org/jira/browse/FLINK-1828 Project: Flink Issue Type: Bug Components: Hadoop Compatibility Affects Versions: 0.9 Reporter: Flavio Pompermaier Labels: hadoop, hbase Fix For: 0.9 Right now it is not possible to use HBase TableOutputFormat as output format because Configurable.setConf is not called in the configure() method of the HadoopOutputFormatBase
[jira] [Commented] (FLINK-1828) Impossible to output data to an HBase table
[ https://issues.apache.org/jira/browse/FLINK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515050#comment-14515050 ] ASF GitHub Bot commented on FLINK-1828: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/571#issuecomment-96828941 This PR was split into PR #632 and PR #633 Impossible to output data to an HBase table --- Key: FLINK-1828 URL: https://issues.apache.org/jira/browse/FLINK-1828 Project: Flink Issue Type: Bug Components: Hadoop Compatibility Affects Versions: 0.9 Reporter: Flavio Pompermaier Labels: hadoop, hbase Fix For: 0.9 Right now it is not possible to use HBase TableOutputFormat as output format because Configurable.setConf is not called in the configure() method of the HadoopOutputFormatBase
[GitHub] flink pull request: Fixed Configurable HadoopOutputFormat (FLINK-1...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/571#issuecomment-96828941 This PR was split into PR #632 and PR #633
[jira] [Updated] (FLINK-1951) NullPointerException in DeltaIteration when no ForwardedFields
[ https://issues.apache.org/jira/browse/FLINK-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-1951: - Priority: Critical (was: Minor) NullPointerException in DeltaIteration when no ForwardedFields -- Key: FLINK-1951 URL: https://issues.apache.org/jira/browse/FLINK-1951 Project: Flink Issue Type: Bug Components: Iterations Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Fabian Hueske Priority: Critical The following exception is thrown by the Connected Components example, if the @ForwardedFieldsFirst(*) annotation from the ComponentIdFilter join is removed:
Caused by: java.lang.NullPointerException
at org.apache.flink.examples.java.graph.ConnectedComponents$ComponentIdFilter.join(ConnectedComponents.java:186)
at org.apache.flink.examples.java.graph.ConnectedComponents$ComponentIdFilter.join(ConnectedComponents.java:1)
at org.apache.flink.runtime.operators.JoinWithSolutionSetSecondDriver.run(JoinWithSolutionSetSecondDriver.java:198)
at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
at org.apache.flink.runtime.iterative.task.AbstractIterativePactTask.run(AbstractIterativePactTask.java:139)
at org.apache.flink.runtime.iterative.task.IterationIntermediatePactTask.run(IterationIntermediatePactTask.java:92)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
at java.lang.Thread.run(Thread.java:745)
[Code | https://github.com/vasia/flink/tree/cc-test] and [dataset | http://snap.stanford.edu/data/com-DBLP.html] to reproduce.
[GitHub] flink pull request: Fixed missing call to configure() for Configur...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/632
[GitHub] flink pull request: Fixed Configurable HadoopOutputFormat (FLINK-1...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/571
[jira] [Resolved] (FLINK-1828) Impossible to output data to an HBase table
[ https://issues.apache.org/jira/browse/FLINK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-1828. -- Resolution: Fixed Fix Version/s: 0.8.2 Fixed in - 0.9 with de573cf5cef3bed6c489af85dba2cc61912db4c0 - 0.8.2 with ffc86f6686520b8ed4270ccd76ac304f64368c6e Impossible to output data to an HBase table --- Key: FLINK-1828 URL: https://issues.apache.org/jira/browse/FLINK-1828 Project: Flink Issue Type: Bug Components: Hadoop Compatibility Affects Versions: 0.9 Reporter: Flavio Pompermaier Labels: hadoop, hbase Fix For: 0.9, 0.8.2 Right now it is not possible to use HBase TableOutputFormat as output format because Configurable.setConf is not called in the configure() method of the HadoopOutputFormatBase
[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets
[ https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513718#comment-14513718 ] ASF GitHub Bot commented on FLINK-1615: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/621#issuecomment-96565454 The tests are failing because you use spaces in your code for indentation. Could you please change all indentation to tabs to satisfy the style checker? Introduces a new InputFormat for Tweets --- Key: FLINK-1615 URL: https://issues.apache.org/jira/browse/FLINK-1615 Project: Flink Issue Type: New Feature Components: flink-contrib Affects Versions: 0.8.1 Reporter: mustafa elbehery Priority: Minor An event-driven parser for Tweets into Java Pojos. It parses all the important parts of a tweet into Java objects. Tested on a cluster, and the performance is pretty good.
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513703#comment-14513703 ] ASF GitHub Bot commented on FLINK-1930: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/624#issuecomment-96555080 Thanks. I'm merging this. (The failed Travis jobs are Travis-related, I think.) NullPointerException in vertex-centric iteration Key: FLINK-1930 URL: https://issues.apache.org/jira/browse/FLINK-1930 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Ufuk Celebi Hello to my Squirrels, I came across this exception when having a vertex-centric iteration output followed by a group by. I'm not sure what is causing it, since I saw this error in a rather large pipeline, but I managed to reproduce it with [this code example | https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309] and a sufficiently large dataset, e.g. [this one | http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally). It seems like a null Buffer in RecordWriter. The exception message is the following:
Exception in thread main org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:37)
at org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:30)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at org.apache.flink.runtime.ActorLogMessages$anon$1.applyOrElse(ActorLogMessages.scala:30)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
at java.lang.Thread.run(Thread.java:745)
[jira] [Created] (FLINK-1946) Make yarn tests logging less verbose
Till Rohrmann created FLINK-1946: Summary: Make yarn tests logging less verbose Key: FLINK-1946 URL: https://issues.apache.org/jira/browse/FLINK-1946 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Priority: Minor Currently, the yarn tests log on the INFO level, making the test outputs confusing. Furthermore, some status messages are written to stdout. I don't think these messages need to be shown to the user.
[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/621#issuecomment-96565454 The tests are failing because you use spaces in your code for indentation. Could you please change all indentation to tabs to satisfy the style checker?
[GitHub] flink pull request: [FLINK-1789] [core] [runtime] [java-api] Allow...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-96532907 Yes, this would make things a lot cleaner. @twalthr what do you think?
[jira] [Commented] (FLINK-1789) Allow adding of URLs to the usercode class loader
[ https://issues.apache.org/jira/browse/FLINK-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513657#comment-14513657 ] ASF GitHub Bot commented on FLINK-1789: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-96532907 Yes, this would make things a lot cleaner. @twalthr what do you think? Allow adding of URLs to the usercode class loader - Key: FLINK-1789 URL: https://issues.apache.org/jira/browse/FLINK-1789 Project: Flink Issue Type: Improvement Components: Distributed Runtime Reporter: Timo Walther Assignee: Timo Walther Priority: Minor Currently, there is no option to add custom classpath URLs to the FlinkUserCodeClassLoader. JARs always need to be shipped to the cluster even if they are already present on all nodes. It would be great if RemoteEnvironment also accepted valid classpath URLs and forwarded them to BlobLibraryCacheManager.
[jira] [Commented] (FLINK-1938) Add Grunt for building the front-end
[ https://issues.apache.org/jira/browse/FLINK-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513713#comment-14513713 ] ASF GitHub Bot commented on FLINK-1938: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/623#issuecomment-96562609 You are definitely welcome to participate in the new web frontend development. Have a look at the branch I sent you and see if you find your way around the new code (it is not too much so far, so it is okay, hopefully). I am not familiar with Grunt. It says it is a JavaScript automation framework. What is it for, in the context of the web frontend? Add Grunt for building the front-end Key: FLINK-1938 URL: https://issues.apache.org/jira/browse/FLINK-1938 Project: Flink Issue Type: Improvement Components: Build System, Webfrontend Reporter: Vikhyat Korrapati Priority: Minor This is the first step towards implementing the web interface refactoring I proposed last year: https://groups.google.com/forum/#!topic/stratosphere-dev/GeXmDXF9DOY Once this is merged, I can get started with the rest of the refactoring. For now, the actual interface is kept the same, the only change is to how the build is done.
[GitHub] flink pull request: [FLINK-1682] Ported optimizer unit tests from ...
GitHub user fhueske opened a pull request: https://github.com/apache/flink/pull/627 [FLINK-1682] Ported optimizer unit tests from Record API to Java API This is a step towards removing the deprecated Record API. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fhueske/flink recordOptimizerTests Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/627.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #627 commit 303e86cc7af2604c3d9c781bd08a656daf6b99ae Author: Fabian Hueske fhue...@apache.org Date: 2015-04-24T23:30:11Z [FLINK-1682] Ported optimizer unit tests from Record API to Java API --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1924) [Py] Refactor a few minor things
[ https://issues.apache.org/jira/browse/FLINK-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513747#comment-14513747 ] ASF GitHub Bot commented on FLINK-1924: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/616#issuecomment-96573887 +1, can you merge it @mxm [Py] Refactor a few minor things Key: FLINK-1924 URL: https://issues.apache.org/jira/browse/FLINK-1924 Project: Flink Issue Type: Improvement Components: Python API Affects Versions: 0.9 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Priority: Trivial Fix For: 0.9
[jira] [Commented] (FLINK-1924) [Py] Refactor a few minor things
[ https://issues.apache.org/jira/browse/FLINK-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513732#comment-14513732 ] ASF GitHub Bot commented on FLINK-1924: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/616#issuecomment-96570629 LGTM [Py] Refactor a few minor things Key: FLINK-1924 URL: https://issues.apache.org/jira/browse/FLINK-1924 Project: Flink Issue Type: Improvement Components: Python API Affects Versions: 0.9 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Priority: Trivial Fix For: 0.9
[jira] [Commented] (FLINK-1523) Vertex-centric iteration extensions
[ https://issues.apache.org/jira/browse/FLINK-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513707#comment-14513707 ] ASF GitHub Bot commented on FLINK-1523: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/537#issuecomment-96558324 Looks good in general. A few thoughts on this: Your prior discussion involved efficiency, and Vasia suggested to not carry the degrees in all cases (as they are not needed in most cases). It seems that was not yet realized, because the Vertex class always carries the degrees. I think we can improve the abstraction between the with-degree and without-degree case a bit. Cases where one has to throw a "not supported" exception can usually be improved with a good inheritance hierarchy. Does it work to make the VertexWithDegrees class a subclass of the Vertex class? Also, as a bit of background: Vertex is a subclass of Tuple2. Tuples are currently the fastest data types in Flink. By adding additional fields to the Tuple, you are making it a POJO (as far as I know), which is a type that is a bit slower to serialize. Also: primitives are usually better than boxed types. Prefer `long` over `Long` where possible. Vertex-centric iteration extensions --- Key: FLINK-1523 URL: https://issues.apache.org/jira/browse/FLINK-1523 Project: Flink Issue Type: Improvement Components: Gelly Reporter: Vasia Kalavri Assignee: Andra Lungu We would like to make the following extensions to the vertex-centric iterations of Gelly: - allow vertices to access their in/out degrees and the total number of vertices of the graph, inside the iteration. - allow choosing the neighborhood type (in/out/all) over which to run the vertex-centric iteration. Currently, the model uses the updates of the in-neighbors to calculate state and send messages to out-neighbors. 
We could add a parameter with value in/out/all to the {{VertexUpdateFunction}} and {{MessagingFunction}}, which would indicate the type of neighborhood. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
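The inheritance the review above suggests could be sketched as follows; the class and field names are illustrative only (Gelly's actual `Vertex` extends `Tuple2`, which is not modeled here), and the degrees use primitives per the serialization advice in the comment:

```java
// Hedged sketch of the hierarchy Stephan suggests: the base vertex carries
// no degree fields; a subclass adds them only for the iterations that need
// them, so no "not supported" exceptions are required in the base class.
// Names are illustrative, not Gelly's actual API.
class VertexSketch<K, V> {
    K id;
    V value;

    VertexSketch(K id, V value) {
        this.id = id;
        this.value = value;
    }
}

class VertexWithDegreesSketch<K, V> extends VertexSketch<K, V> {
    // primitives rather than boxed Longs, per the advice above
    long inDegree;
    long outDegree;

    VertexWithDegreesSketch(K id, V value, long inDegree, long outDegree) {
        super(id, value);
        this.inDegree = inDegree;
        this.outDegree = outDegree;
    }
}
```

Code written against the plain vertex then works unchanged on the degree-carrying subclass, while only degree-aware iterations pay for the extra fields.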
[jira] [Closed] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi closed FLINK-1930. -- Resolution: Fixed Fixed in 88638de. NullPointerException in vertex-centric iteration Key: FLINK-1930 URL: https://issues.apache.org/jira/browse/FLINK-1930 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Ufuk Celebi Hello to my Squirrels, I came across this exception when having a vertex-centric iteration output followed by a group by. I'm not sure what is causing it, since I saw this error in a rather large pipeline, but I managed to reproduce it with [this code example | https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309] and a sufficiently large dataset, e.g. [this one | http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally). It seems like a null Buffer in RecordWriter. The exception message is the following: Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed. 
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
at java.lang.Thread.run(Thread.java:745)
[GitHub] flink pull request: [FLINK-1924] Minor Refactoring
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/616#issuecomment-96570629 LGTM
[jira] [Assigned] (FLINK-1923) Replace asynchronous logging in JobManager with regular slf4j logging
[ https://issues.apache.org/jira/browse/FLINK-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann reassigned FLINK-1923: Assignee: Till Rohrmann Replace asynchronous logging in JobManager with regular slf4j logging - Key: FLINK-1923 URL: https://issues.apache.org/jira/browse/FLINK-1923 Project: Flink Issue Type: Task Components: JobManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Till Rohrmann It's hard to understand exactly what's going on in the JobManager because the log messages are sent asynchronously by a logging actor.
[jira] [Created] (FLINK-1945) Make python tests less verbose
Till Rohrmann created FLINK-1945: Summary: Make python tests less verbose Key: FLINK-1945 URL: https://issues.apache.org/jira/browse/FLINK-1945 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Priority: Minor Currently, the python tests print a lot of log messages to stdout. Furthermore, there seem to be some println statements that clutter the console output. I think that these log messages are not required for the tests and thus should be suppressed.
[GitHub] flink pull request: yarn client tests
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/134
[GitHub] flink pull request: Flink 964
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/112
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513706#comment-14513706 ] ASF GitHub Bot commented on FLINK-1930: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/624 NullPointerException in vertex-centric iteration Key: FLINK-1930 URL: https://issues.apache.org/jira/browse/FLINK-1930 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Ufuk Celebi
[GitHub] flink pull request: [FLINK-1930] Separate output buffer pool and r...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/624
[GitHub] flink pull request: [FLINK-1523][gelly] Vertex centric iteration e...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/537#issuecomment-96558324 Looks good in general. A few thoughts on this: Your prior discussion involved efficiency, and Vasia suggested to not carry the degrees in all cases (as they are not needed in most cases). It seems that was not yet realized, because the Vertex class always carries the degrees. I think we can improve the abstraction between the with-degree and without-degree case a bit. Cases where one has to throw a "not supported" exception can usually be improved with a good inheritance hierarchy. Does it work to make the VertexWithDegrees class a subclass of the Vertex class? Also, as a bit of background: Vertex is a subclass of Tuple2. Tuples are currently the fastest data types in Flink. By adding additional fields to the Tuple, you are making it a POJO (as far as I know), which is a type that is a bit slower to serialize. Also: primitives are usually better than boxed types. Prefer `long` over `Long` where possible.
[jira] [Updated] (FLINK-1946) Make yarn tests logging less verbose
[ https://issues.apache.org/jira/browse/FLINK-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1946: -- Component/s: YARN Client Make yarn tests logging less verbose Key: FLINK-1946 URL: https://issues.apache.org/jira/browse/FLINK-1946 Project: Flink Issue Type: Improvement Components: YARN Client Reporter: Till Rohrmann Priority: Minor Currently, the yarn tests log on the INFO level, making the test output confusing. Furthermore, some status messages are written to stdout. I think these messages do not need to be shown to the user.
[GitHub] flink pull request: Fixed Configurable HadoopOutputFormat (FLINK-1...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/571#discussion_r29129684 --- Diff: flink-staging/flink-hbase/pom.xml --- @@ -112,6 +112,12 @@ under the License. </exclusion> </exclusions> </dependency> + <dependency> --- End diff -- I do not have a HBase setup here. Could you try to exclude all dependencies of hbase-server and add them until it works? I hope the TableInputFormat and TableOutputFormat do not have too many external dependencies.
[jira] [Commented] (FLINK-1828) Impossible to output data to an HBase table
[ https://issues.apache.org/jira/browse/FLINK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513737#comment-14513737 ] ASF GitHub Bot commented on FLINK-1828: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/571#discussion_r29129684 --- Diff: flink-staging/flink-hbase/pom.xml --- @@ -112,6 +112,12 @@ under the License. </exclusion> </exclusions> </dependency> + <dependency> --- End diff -- I do not have a HBase setup here. Could you try to exclude all dependencies of hbase-server and add them until it works? I hope the TableInputFormat and TableOutputFormat do not have too many external dependencies. Impossible to output data to an HBase table --- Key: FLINK-1828 URL: https://issues.apache.org/jira/browse/FLINK-1828 Project: Flink Issue Type: Bug Components: Hadoop Compatibility Affects Versions: 0.9 Reporter: Flavio Pompermaier Labels: hadoop, hbase Fix For: 0.9 Right now it is not possible to use HBase TableOutputFormat as output format because Configurable.setConf is not called in the configure() method of the HadoopOutputFormatBase
[GitHub] flink pull request: Fixed Configurable HadoopOutputFormat (FLINK-1...
Github user fpompermaier commented on a diff in the pull request: https://github.com/apache/flink/pull/571#discussion_r29130488 --- Diff: flink-staging/flink-hbase/pom.xml --- @@ -112,6 +112,12 @@ under the License. </exclusion> </exclusions> </dependency> + <dependency> --- End diff -- Ok, I hope to be able to do it before this evening!
[jira] [Commented] (FLINK-1828) Impossible to output data to an HBase table
[ https://issues.apache.org/jira/browse/FLINK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513753#comment-14513753 ] ASF GitHub Bot commented on FLINK-1828: --- Github user fpompermaier commented on a diff in the pull request: https://github.com/apache/flink/pull/571#discussion_r29130488 --- Diff: flink-staging/flink-hbase/pom.xml --- @@ -112,6 +112,12 @@ under the License. </exclusion> </exclusions> </dependency> + <dependency> --- End diff -- Ok, I hope to be able to do it before this evening! Impossible to output data to an HBase table --- Key: FLINK-1828 URL: https://issues.apache.org/jira/browse/FLINK-1828 Project: Flink Issue Type: Bug Components: Hadoop Compatibility Affects Versions: 0.9 Reporter: Flavio Pompermaier Labels: hadoop, hbase Fix For: 0.9 Right now it is not possible to use HBase TableOutputFormat as output format because Configurable.setConf is not called in the configure() method of the HadoopOutputFormatBase
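The fix FLINK-1828 describes — forwarding the job configuration to wrapped formats that implement Configurable inside configure() — can be sketched as follows. Hadoop's Configurable interface and the output-format wrapper are stubbed with minimal stand-ins here; this is not the actual HadoopOutputFormatBase code:

```java
// Sketch of the fix described in FLINK-1828: the wrapper's configure()
// must forward the job configuration to wrapped Hadoop output formats
// that implement Configurable. Hadoop's interface is stubbed with a
// minimal stand-in so the sketch is self-contained.
interface Configurable {
    void setConf(Object conf);
}

// Stand-in for a format like HBase's TableOutputFormat, which only works
// once it has received its configuration.
class RecordingTableOutputFormat implements Configurable {
    Object receivedConf;

    public void setConf(Object conf) {
        this.receivedConf = conf;
    }
}

class HadoopOutputFormatWrapperSketch {
    private final Object wrappedFormat;
    private final Object jobConf;

    HadoopOutputFormatWrapperSketch(Object wrappedFormat, Object jobConf) {
        this.wrappedFormat = wrappedFormat;
        this.jobConf = jobConf;
    }

    void configure() {
        // The step the issue reports as missing: without this call,
        // Configurable formats never see their configuration.
        if (wrappedFormat instanceof Configurable) {
            ((Configurable) wrappedFormat).setConf(jobConf);
        }
    }
}
```

The instanceof guard keeps the wrapper working for plain Hadoop formats that are not Configurable.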
[GitHub] flink pull request: [FLINK-1925] Fixes blocking method submitTask ...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/622#discussion_r29140659 --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala ---
@@ -795,108 +795,111 @@ extends Actor with ActorLogMessages with ActorLogging {
    var startRegisteringTask = 0L
    var task: Task = null
-   // all operations are in a try / catch block to make sure we send a result upon any failure
-   try {
-     // check that we are already registered
-     if (!isConnected) {
-       throw new IllegalStateException("TaskManager is not associated with a JobManager")
-     }
-     if (slot < 0 || slot >= numberOfSlots) {
-       throw new Exception(s"Target slot ${slot} does not exist on TaskManager.")
-     }
+   if (!isConnected) {
+     sender ! Failure(
+       new IllegalStateException("TaskManager is not associated with a JobManager.")
+     )
+   } else if (slot < 0 || slot >= numberOfSlots) {
+     sender ! Failure(new Exception(s"Target slot $slot does not exist on TaskManager."))
+   } else {
+     sender ! Acknowledge
-     val userCodeClassLoader = libraryCacheManager match {
-       case Some(manager) =>
-         if (LOG.isDebugEnabled) {
-           startRegisteringTask = System.currentTimeMillis()
-         }
+     Future {
+       try {
--- End diff -- Can we pull the code in the future into its own method? Makes it easier to understand by separating the different parts along the methods.
[jira] [Commented] (FLINK-1925) Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD
[ https://issues.apache.org/jira/browse/FLINK-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513981#comment-14513981 ] ASF GitHub Bot commented on FLINK-1925: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/622#discussion_r29140659 --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala ---
@@ -795,108 +795,111 @@ extends Actor with ActorLogMessages with ActorLogging {
    var startRegisteringTask = 0L
    var task: Task = null
-   // all operations are in a try / catch block to make sure we send a result upon any failure
-   try {
-     // check that we are already registered
-     if (!isConnected) {
-       throw new IllegalStateException("TaskManager is not associated with a JobManager")
-     }
-     if (slot < 0 || slot >= numberOfSlots) {
-       throw new Exception(s"Target slot ${slot} does not exist on TaskManager.")
-     }
+   if (!isConnected) {
+     sender ! Failure(
+       new IllegalStateException("TaskManager is not associated with a JobManager.")
+     )
+   } else if (slot < 0 || slot >= numberOfSlots) {
+     sender ! Failure(new Exception(s"Target slot $slot does not exist on TaskManager."))
+   } else {
+     sender ! Acknowledge
-     val userCodeClassLoader = libraryCacheManager match {
-       case Some(manager) =>
-         if (LOG.isDebugEnabled) {
-           startRegisteringTask = System.currentTimeMillis()
-         }
+     Future {
+       try {
--- End diff -- Can we pull the code in the future into its own method? Makes it easier to understand by separating the different parts along the methods. Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD Key: FLINK-1925 URL: https://issues.apache.org/jira/browse/FLINK-1925 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Assignee: Till Rohrmann A user reported that a job times out while submitting tasks to the TaskManager. The reason is that the JobManager expects a TaskOperationResult response upon submitting a task to the TM. The TM then downloads the required jars from the JM, which blocks the actor thread and can take a very long time if many TMs download from the JM. Due to this, the SubmitTask future throws a TimeOutException. A possible solution could be that the TM eagerly acknowledges the reception of the SubmitTask message and executes the task initialization within a future. The future will upon completion send an UpdateTaskExecutionState message to the JM which switches the state of the task from deploying to running. This means that the handler of the SubmitTask future in {{Execution}} won't change the state of the task.
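The two-phase submission proposed in the issue can be sketched with a plain CompletableFuture standing in for the actor-based future; the message strings are illustrative, not Flink's actual runtime messages:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of the two-phase task submission proposed in FLINK-1925:
// acknowledge receipt immediately, run the slow initialization
// asynchronously, and report the state change afterwards.
class TaskSubmissionSketch {
    final ConcurrentLinkedQueue<String> messagesToJobManager = new ConcurrentLinkedQueue<>();

    CompletableFuture<Void> submitTask(String taskId) {
        // Phase 1: eager acknowledge, so the JobManager's submit future
        // cannot time out while jars are still being downloaded.
        messagesToJobManager.add("Acknowledge");
        // Phase 2: slow initialization (jar download, task instantiation)
        // runs off the actor thread; on completion, the switch from
        // DEPLOYING to RUNNING is reported as a separate message.
        return CompletableFuture.runAsync(() ->
            messagesToJobManager.add("UpdateTaskExecutionState:" + taskId + ":RUNNING"));
    }
}
```

The key property is that the acknowledgement is sent before any blocking work starts, so submission latency no longer depends on how long initialization takes.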
[jira] [Commented] (FLINK-1820) Bug in DoubleParser and FloatParser - empty String is not casted to 0
[ https://issues.apache.org/jira/browse/FLINK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514099#comment-14514099 ] ASF GitHub Bot commented on FLINK-1820: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29144391 --- Diff: flink-core/src/main/java/org/apache/flink/types/parser/IntParser.java ---
@@ -38,19 +38,28 @@ public int parseField(byte[] bytes, int startPos, int limit, byte[] delimiter, I
    final int delimLimit = limit - delimiter.length + 1;
+   if (bytes.length == 0) {
--- End diff -- This check is not strictly necessary, IMO. `bytes` is a larger byte array which is reused by the calling `GenericCsvInputFormat`. To reduce the processing overhead of each field, I would omit the check (here and in the Long and Short parsers). Bug in DoubleParser and FloatParser - empty String is not casted to 0 - Key: FLINK-1820 URL: https://issues.apache.org/jira/browse/FLINK-1820 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.8.0, 0.9, 0.8.1 Reporter: Felix Neutatz Assignee: Felix Neutatz Priority: Critical Fix For: 0.9 Hi, I found the bug when I wanted to read a csv file, which had a line like: ||\n If I treat it as a Tuple2<Long, Long>, I get as expected a tuple (0L,0L). But if I want to read it into a Double-Tuple or a Float-Tuple, I get the following error: java.lang.AssertionError: Test failed due to a org.apache.flink.api.common.io.ParseException: Line could not be parsed: '||' ParserError NUMERIC_VALUE_FORMAT_ERROR This error can be solved by adding an additional condition for empty strings in the FloatParser / DoubleParser. We definitely need the CSVReader to be able to read empty values. I can fix it as described if there are no better ideas :)
[jira] [Commented] (FLINK-1820) Bug in DoubleParser and FloatParser - empty String is not casted to 0
[ https://issues.apache.org/jira/browse/FLINK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514102#comment-14514102 ] ASF GitHub Bot commented on FLINK-1820: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/566#issuecomment-96647676 Looks good. I added a few minor comments inline. Did you check if the changes should also go into the `ByteParser`? Bug in DoubleParser and FloatParser - empty String is not casted to 0 - Key: FLINK-1820 URL: https://issues.apache.org/jira/browse/FLINK-1820 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.8.0, 0.9, 0.8.1 Reporter: Felix Neutatz Assignee: Felix Neutatz Priority: Critical Fix For: 0.9 Hi, I found the bug when I wanted to read a csv file, which had a line like: ||\n If I treat it as a Tuple2<Long, Long>, I get as expected a tuple (0L,0L). But if I want to read it into a Double-Tuple or a Float-Tuple, I get the following error: java.lang.AssertionError: Test failed due to a org.apache.flink.api.common.io.ParseException: Line could not be parsed: '||' ParserError NUMERIC_VALUE_FORMAT_ERROR This error can be solved by adding an additional condition for empty strings in the FloatParser / DoubleParser. We definitely need the CSVReader to be able to read empty values. I can fix it as described if there are no better ideas :)
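The behavior the fix aims for — an empty field between delimiters parsing to 0 instead of raising a numeric-format error — can be sketched as follows. This is a simplified stand-in, not Flink's actual DoubleParser, which operates on byte ranges with delimiters rather than strings:

```java
// Simplified sketch of the empty-field handling discussed in FLINK-1820:
// map an empty field (e.g. the value between the two pipes in "||") to 0
// instead of raising a numeric-format error. Not the actual Flink parser.
class LenientDoubleParserSketch {
    static double parseField(String field) {
        if (field.isEmpty()) {
            return 0.0; // empty value between delimiters
        }
        return Double.parseDouble(field);
    }
}
```

With this rule, a line like "||" parses to the tuple (0.0, 0.0), matching the behavior the long parsers already show.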
[jira] [Commented] (FLINK-1923) Replace asynchronous logging in JobManager with regular slf4j logging
[ https://issues.apache.org/jira/browse/FLINK-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513973#comment-14513973 ] ASF GitHub Bot commented on FLINK-1923: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/628#issuecomment-96622572 I am curious, why did you rewrite the TaskManager? I thought that one was logging synchronously already. Replace asynchronous logging in JobManager with regular slf4j logging - Key: FLINK-1923 URL: https://issues.apache.org/jira/browse/FLINK-1923 Project: Flink Issue Type: Task Components: JobManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Till Rohrmann It's hard to understand exactly what's going on in the JobManager because the log messages are sent asynchronously by a logging actor.
[GitHub] flink pull request: [FLINK-1792] TM Monitoring: CPU utilization, h...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/553#issuecomment-96631425 Thank you. I'm trying to find time to review your changes soon.
[jira] [Commented] (FLINK-1792) Improve TM Monitoring: CPU utilization, hide graphs by default and show summary only
[ https://issues.apache.org/jira/browse/FLINK-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514030#comment-14514030 ] ASF GitHub Bot commented on FLINK-1792: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/553#issuecomment-96631425 Thank you. I'm trying to find time to review your changes soon. Improve TM Monitoring: CPU utilization, hide graphs by default and show summary only Key: FLINK-1792 URL: https://issues.apache.org/jira/browse/FLINK-1792 Project: Flink Issue Type: Sub-task Components: Webfrontend Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Sachin Bhat As per https://github.com/apache/flink/pull/421 from FLINK-1501, some enhancements to the current monitoring are required - Get the CPU utilization in % from each TaskManager process - Remove the metrics graph from the overview and only show the current stats as numbers (cpu load, heap utilization) and add a button to enable the detailed graph.
[jira] [Assigned] (FLINK-1922) Failed task deployment causes NPE on input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann reassigned FLINK-1922: Assignee: Till Rohrmann Failed task deployment causes NPE on input split assignment --- Key: FLINK-1922 URL: https://issues.apache.org/jira/browse/FLINK-1922 Project: Flink Issue Type: Bug Components: JobManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Till Rohrmann The input split assignment code is returning {null} if the Task has failed, which is causing an NPE. We should improve our error handling / reporting in that situation. {code} 13:12:31,002 INFO org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 - Status of job c0b47ce41e9a85a628a628a3977705ef (Flink Java Job at Tue Apr 21 13:10:36 UTC 2015) changed to FAILING Cannot deploy task - TaskManager not responding.. 13:12:47,591 ERROR org.apache.flink.runtime.operators.RegularPactTask - Error in task code: CHAIN DataSource (at userMethod (org.apache.flink.api.java.io.AvroInputFormat)) - FlatMap (FlatMap at main(UserClass.java:111)) (20/50) java.lang.RuntimeException: Requesting the next InputSplit failed. at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:88) at org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:337) at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:136) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:106) at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:301) at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:83) ... 
4 more 13:12:47,595 INFO org.apache.flink.runtime.taskmanager.Task - CHAIN DataSource (at SomeMethod (org.apache.flink.api.java.io.AvroInputFormat)) - FlatMap (FlatMap at main(SomeClass.java:111)) (20/50) switched to FAILED : java.lang.RuntimeException: Requesting the next InputSplit failed. at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:88) at org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:337) at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:136) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:106) at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:301) at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:83) ... 4 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1923) Replace asynchronous logging in JobManager with regular slf4j logging
[ https://issues.apache.org/jira/browse/FLINK-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513971#comment-14513971 ] ASF GitHub Bot commented on FLINK-1923: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/628#discussion_r29140183 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java --- @@ -357,16 +356,6 @@ protected void unregisterTask() { taskManager.tell(new UnregisterTask(executionId), ActorRef.noSender()); } - protected void notifyExecutionStateChange(ExecutionState executionState, --- End diff -- Was this method unused? Replace asynchronous logging in JobManager with regular slf4j logging - Key: FLINK-1923 URL: https://issues.apache.org/jira/browse/FLINK-1923 Project: Flink Issue Type: Task Components: JobManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Till Rohrmann It's hard to understand exactly what's going on in the JobManager because the log messages are sent asynchronously by a logging actor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1682) Port Record-API based optimizer tests to new Java API
[ https://issues.apache.org/jira/browse/FLINK-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514045#comment-14514045 ] ASF GitHub Bot commented on FLINK-1682: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/627#issuecomment-96631688 Yes, very good point. I replaced the `.print()` statements with `.output(new DiscardingOutputFormat())` as suggested (not only in the ported tests but also some more along the way). Port Record-API based optimizer tests to new Java API - Key: FLINK-1682 URL: https://issues.apache.org/jira/browse/FLINK-1682 Project: Flink Issue Type: Sub-task Components: Optimizer Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
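For context, the point of the swap is that the optimizer tests only need a sink node to terminate the plan, not materialized output. The shape of such a discarding sink can be sketched in plain Java (an illustration only, not Flink's actual `DiscardingOutputFormat`):

```java
import java.util.Arrays;
import java.util.List;

// Illustrative discarding sink: accepts every record and writes nothing,
// which is all an optimizer test needs to close off a plan.
public class DiscardingSink<T> {
    private long received = 0;

    public void writeRecord(T record) {
        received++; // the record itself is dropped; no I/O happens
    }

    public long count() {
        return received;
    }

    // Convenience helper: push a list of records through a fresh sink.
    public static <T> long drain(List<T> records) {
        DiscardingSink<T> sink = new DiscardingSink<>();
        for (T r : records) {
            sink.writeRecord(r);
        }
        return sink.count();
    }

    public static void main(String[] args) {
        System.out.println(drain(Arrays.asList("a", "b", "c"))); // prints 3
    }
}
```

Unlike `.print()`, nothing here depends on collecting results back to the client, which is why the swap is safe in tests that only inspect the optimized plan.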
[GitHub] flink pull request: [FLINK-1922] Fixes NPE when TM receives a null...
GitHub user tillrohrmann opened a pull request: https://github.com/apache/flink/pull/631 [FLINK-1922] Fixes NPE when TM receives a null input split The ```TaskInputSplitProvider``` did not handle null input splits which are wrapped into a ```NextInputSplit``` message properly. Fixed this. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tillrohrmann/flink fixInputSplit Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/631.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #631 commit dad90438f7508309122611f5d79ee9295d242a6f Author: Till Rohrmann trohrm...@apache.org Date: 2015-04-27T12:30:16Z [FLINK-1922] [runtime] Fixes NPE when TM receives a null input split
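The failure mode is easy to reproduce in isolation: `ByteArrayInputStream`'s constructor throws an NPE when handed a null array, so the deserialization path needs an explicit null check. A minimal sketch in plain Java (the class and method names are illustrative stand-ins, not Flink's actual `InstantiationUtil` API):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Sketch of the fix: guard against a null serialized payload before handing
// it to ByteArrayInputStream, whose constructor would otherwise NPE.
public class InputSplitDeserializer {

    // Hypothetical stand-in for the deserialization step in the split provider.
    public static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        if (bytes == null) {
            // A null payload signals "no further input splits", not an error.
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
        }
        return bos.toByteArray();
    }
}
```

With the guard in place, a null `NextInputSplit` payload becomes a normal "end of splits" result instead of the NullPointerException shown in the stack trace above.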
[GitHub] flink pull request: [FLINK-1820] CSVReader: In case of an empty st...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/566#issuecomment-96647676 Looks good. I added a few minor comments inline. Did you check if the changes should also go into the `ByteParser`?
[jira] [Assigned] (FLINK-1945) Make python tests less verbose
[ https://issues.apache.org/jira/browse/FLINK-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-1945: --- Assignee: Chesnay Schepler Make python tests less verbose -- Key: FLINK-1945 URL: https://issues.apache.org/jira/browse/FLINK-1945 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Assignee: Chesnay Schepler Priority: Minor Currently, the python tests print a lot of log messages to stdout. Furthermore, there seem to be some println statements which clutter the console output. I think that these log messages are not required for the tests and thus should be suppressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1820) Bug in DoubleParser and FloatParser - empty String is not casted to 0
[ https://issues.apache.org/jira/browse/FLINK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514122#comment-14514122 ] ASF GitHub Bot commented on FLINK-1820: --- Github user FelixNeutatz commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29145755 --- Diff: flink-core/src/main/java/org/apache/flink/types/parser/IntParser.java --- @@ -38,19 +38,28 @@ public int parseField(byte[] bytes, int startPos, int limit, byte[] delimiter, I final int delimLimit = limit-delimiter.length+1; + if (bytes.length == 0) { --- End diff -- If I skip this check - the LongParserTest, ShortParserTest ... will fail because of an out-of-bound-exception ... Bug in DoubleParser and FloatParser - empty String is not casted to 0 - Key: FLINK-1820 URL: https://issues.apache.org/jira/browse/FLINK-1820 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.8.0, 0.9, 0.8.1 Reporter: Felix Neutatz Assignee: Felix Neutatz Priority: Critical Fix For: 0.9 Hi, I found the bug, when I wanted to read a csv file, which had a line like: ||\n If I treat it as a Tuple2<Long, Long>, I get as expected a tuple (0L,0L). But if I want to read it into a Double-Tuple or a Float-Tuple, I get the following error: java.lang.AssertionError: Test failed due to a org.apache.flink.api.common.io.ParseException: Line could not be parsed: '||' ParserError NUMERIC_VALUE_FORMAT_ERROR This error can be solved by adding an additional condition for empty strings in the FloatParser / DoubleParser. We definitely need the CSVReader to be able to read empty values. I can fix it like described if there are no better ideas :) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
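The empty-field behavior under discussion can be shown with a stripped-down parser in plain Java. This mimics the idea of the patch only; Flink's real parsers work on byte arrays with delimiters and an internal error state, and the names below are illustrative:

```java
// Stripped-down field parser: an empty field sets an error state instead of
// letting Float.parseFloat throw a NumberFormatException deep in the parser.
public class SimpleFloatParser {

    public enum ParseError { NONE, EMPTY_STRING, NUMERIC_VALUE_FORMAT_ERROR }

    private ParseError errorState = ParseError.NONE;

    public ParseError getErrorState() {
        return errorState;
    }

    // Returns the parsed value; on failure returns 0f and records the cause.
    public float parseField(String field) {
        errorState = ParseError.NONE;
        if (field.isEmpty()) {
            errorState = ParseError.EMPTY_STRING;
            return 0f;
        }
        if (field.length() > field.trim().length()) {
            // Mirrors the patch's `len > str.trim().length()` check:
            // whitespace-padded fields are rejected, not silently trimmed.
            errorState = ParseError.NUMERIC_VALUE_FORMAT_ERROR;
            return 0f;
        }
        try {
            return Float.parseFloat(field);
        } catch (NumberFormatException e) {
            errorState = ParseError.NUMERIC_VALUE_FORMAT_ERROR;
            return 0f;
        }
    }
}
```

The whitespace check matters because `Float.parseFloat` itself tolerates surrounding whitespace, so without it a padded field would silently parse rather than surface an error.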
[jira] [Commented] (FLINK-1933) Add distance measure interface and basic implementation to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513953#comment-14513953 ] ASF GitHub Bot commented on FLINK-1933: --- GitHub user chiwanpark opened a pull request: https://github.com/apache/flink/pull/629 [FLINK-1933] Add distance measure interface and basic implementation to machine learning library This PR contains following changes: * Add `dot` method and `magnitude` method. * Add `DistanceMeasure` trait. * Add 7 basic implementation of `DistanceMeasure`. * Add tests for above changes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/chiwanpark/flink FLINK-1933 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/629.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #629 commit 3fac0ff339de93ab1c3b3582924af75a1e6057ea Author: Chiwan Park chiwanp...@icloud.com Date: 2015-04-24T03:50:28Z [FLINK-1933] [ml] Add dot product and magnitude into Vector commit c8f940c2439f754ef0e640b5440507bce4b859d2 Author: Chiwan Park chiwanp...@icloud.com Date: 2015-04-27T09:48:56Z [FLINK-1933] [ml] Add distance measure interface and basic implementation to machine learning library Add distance measure interface and basic implementation to machine learning library --- Key: FLINK-1933 URL: https://issues.apache.org/jira/browse/FLINK-1933 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Chiwan Park Assignee: Chiwan Park Labels: ML Add distance measure interface to calculate distance between two vectors and some implementations of the interface. 
In FLINK-1745, [~till.rohrmann] suggests an interface like the following: {code} trait DistanceMeasure { def distance(a: Vector, b: Vector): Double } {code} I think the following list of implementations is sufficient to provide to ML library users at first. * Manhattan distance [1] * Cosine distance [2] * Euclidean distance (and Squared) [3] * Tanimoto distance [4] * Minkowski distance [5] * Chebyshev distance [6] [1]: http://en.wikipedia.org/wiki/Taxicab_geometry [2]: http://en.wikipedia.org/wiki/Cosine_similarity [3]: http://en.wikipedia.org/wiki/Euclidean_distance [4]: http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29 [5]: http://en.wikipedia.org/wiki/Minkowski_distance [6]: http://en.wikipedia.org/wiki/Chebyshev_distance -- This message was sent by Atlassian JIRA (v6.3.4#6332)
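As a plain-Java illustration of two of the proposed measures (the actual implementations in the PR are Scala and operate on Flink ML's `Vector` type; dense `double[]` arrays of equal length are assumed here):

```java
// Illustrative Euclidean and Manhattan distances over dense double arrays.
public class Distances {

    public static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d; // the squared Euclidean distance accumulates here
        }
        return Math.sqrt(sum);
    }

    public static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }
}
```

The other proposed measures (Cosine, Tanimoto, Minkowski, Chebyshev) follow the same per-component pattern, which is why a common `DistanceMeasure` trait with a single `distance(a, b)` method covers them all.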
[GitHub] flink pull request: [FLINK-1911] [streaming] Streaming projection ...
GitHub user szape opened a pull request: https://github.com/apache/flink/pull/630 [FLINK-1911] [streaming] Streaming projection without types Since the DataSet projection has been reworked to no longer require the `.types(...)` call, the streaming and batch methods were out of sync. The streaming API projection was modified accordingly. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink FLINK-1911 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/630.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #630 commit a36e7bfe8cf5d5a949714de215aaabf03494f62d Author: szape nemderogator...@gmail.com Date: 2015-04-20T14:53:46Z [FLINK-1911] [streaming] Working streaming projection prototype without types
[GitHub] flink pull request: [FLINK-1925] Fixes blocking method submitTask ...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/622#discussion_r29140453 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java --- @@ -345,26 +346,9 @@ public void onComplete(Throwable failure, Object success) throws Throwable { } } else { - if (success == null) { - markFailed(new Exception("Failed to deploy the task to slot " + slot + ": TaskOperationResult was null")); - } - - if (success instanceof TaskOperationResult) { - TaskOperationResult result = (TaskOperationResult) success; - - if (!result.executionID().equals(attemptId)) { - markFailed(new Exception("Answer execution id does not match the request execution id.")); - } else if (result.success()) { - switchToRunning(); - } else { - // deployment failed :( - markFailed(new Exception("Failed to deploy the task " + - getVertexWithAttempt() + " to slot " + slot + ": " + result - .description())); - } - } else { + if (!(success instanceof Messages.Acknowledge$)) { --- End diff -- I think this line is not parsable in Eclipse (the `$` messes with the Java parser). A workaround is to expose the case object class and object via a utility method and check against that.
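The parser problem comes from how scalac compiles a Scala `case object`: it emits a class whose binary name ends in `$`, holding a static `MODULE$` singleton field, and referencing that name directly from Java trips up some tooling. A plain-Java mock of that layout and of the suggested utility-method workaround (names here are illustrative, not Flink's actual `Messages`):

```java
// Mock of the bytecode layout scalac produces for `case object Acknowledge`:
// a final class named with a trailing '$' holding a static MODULE$ singleton.
public class AcknowledgeCheck {

    static final class Acknowledge$ {
        static final Acknowledge$ MODULE$ = new Acknowledge$();
        private Acknowledge$() {}
    }

    // The workaround: a utility method exposes the singleton so callers never
    // have to spell out the '$'-suffixed class name themselves.
    public static Object acknowledgeInstance() {
        return Acknowledge$.MODULE$;
    }

    public static boolean isAcknowledge(Object msg) {
        return msg == acknowledgeInstance(); // case objects are singletons
    }
}
```

Reference equality suffices here because a case object has exactly one instance per classloader, which is also why the identity check is cheaper than `instanceof`.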
[GitHub] flink pull request: [FLINK-1820] CSVReader: In case of an empty st...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29143814 --- Diff: flink-core/src/main/java/org/apache/flink/types/parser/FloatParser.java --- @@ -102,6 +111,10 @@ public static final float parseField(byte[] bytes, int startPos, int length, cha } String str = new String(bytes, startPos, i); + int len = str.length(); + if (len > str.trim().length()) { --- End diff -- See other comment on `String.trim()`
[jira] [Commented] (FLINK-1820) Bug in DoubleParser and FloatParser - empty String is not casted to 0
[ https://issues.apache.org/jira/browse/FLINK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514092#comment-14514092 ] ASF GitHub Bot commented on FLINK-1820: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29143809 --- Diff: flink-core/src/main/java/org/apache/flink/types/parser/FloatParser.java --- @@ -41,6 +41,15 @@ public int parseField(byte[] bytes, int startPos, int limit, byte[] delimiter, F } String str = new String(bytes, startPos, i-startPos); + int len = str.length(); + if (len == 0) { + setErrorState(ParseErrorState.EMPTY_STRING); + return -1; + } + if (len > str.trim().length()) { --- End diff -- See other comment on `String.trim()` Bug in DoubleParser and FloatParser - empty String is not casted to 0 - Key: FLINK-1820 URL: https://issues.apache.org/jira/browse/FLINK-1820 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.8.0, 0.9, 0.8.1 Reporter: Felix Neutatz Assignee: Felix Neutatz Priority: Critical Fix For: 0.9 Hi, I found the bug, when I wanted to read a csv file, which had a line like: ||\n If I treat it as a Tuple2<Long, Long>, I get as expected a tuple (0L,0L). But if I want to read it into a Double-Tuple or a Float-Tuple, I get the following error: java.lang.AssertionError: Test failed due to a org.apache.flink.api.common.io.ParseException: Line could not be parsed: '||' ParserError NUMERIC_VALUE_FORMAT_ERROR This error can be solved by adding an additional condition for empty strings in the FloatParser / DoubleParser. We definitely need the CSVReader to be able to read empty values. I can fix it like described if there are no better ideas :) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1820] CSVReader: In case of an empty st...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29143800 --- Diff: flink-core/src/main/java/org/apache/flink/types/parser/DoubleParser.java --- @@ -42,6 +42,15 @@ public int parseField(byte[] bytes, int startPos, int limit, byte[] delimiter, D } String str = new String(bytes, startPos, i-startPos); + int len = str.length(); + if (len == 0) { + setErrorState(ParseErrorState.EMPTY_STRING); + return -1; + } + if (len > str.trim().length()) { --- End diff -- See other comment on `String.trim()`
[jira] [Commented] (FLINK-1820) Bug in DoubleParser and FloatParser - empty String is not casted to 0
[ https://issues.apache.org/jira/browse/FLINK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514087#comment-14514087 ] ASF GitHub Bot commented on FLINK-1820: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29143622 --- Diff: flink-java/src/test/java/org/apache/flink/api/java/io/CsvInputFormatTest.java --- @@ -353,6 +354,99 @@ public void testIntegerFieldsl() throws IOException { assertEquals(Integer.valueOf(888), result.f2); assertEquals(Integer.valueOf(999), result.f3); assertEquals(Integer.valueOf(000), result.f4); + + result = format.nextRecord(result); + assertNull(result); + assertTrue(format.reachedEnd()); + } + catch (Exception ex) { + fail("Test failed due to a " + ex.getClass().getName() + ": " + ex.getMessage()); + } + } + + @Test + public void testEmptyFields() throws IOException { + try { + final String fileContent = "|0|0|0|0\n" + + "1||1|1|1|\n" + + "2|2| |2|2|\n" + + "3 |3|3| |3|\n" + + "4|4|4|4| |\n"; + final FileInputSplit split = createTempFile(fileContent); + + final TupleTypeInfo<Tuple5<Short, Integer, Long, Float, Double>> typeInfo = + TupleTypeInfo.getBasicTupleTypeInfo(Short.class, Integer.class, Long.class, Float.class, Double.class); + final CsvInputFormat<Tuple5<Short, Integer, Long, Float, Double>> format = new CsvInputFormat<Tuple5<Short, Integer, Long, Float, Double>>(PATH, typeInfo); + + format.setFieldDelimiter("|"); + + format.configure(new Configuration()); + format.open(split); + + Tuple5<Short, Integer, Long, Float, Double> result = new Tuple5<Short, Integer, Long, Float, Double>(); + + try { + result = format.nextRecord(result); + fail("Empty String Parse Exception was not thrown! (ShortParser)"); + } catch (ParseException e) {} + try { + result = format.nextRecord(result); + fail("Empty String Parse Exception was not thrown! (IntegerParser)"); + } catch (ParseException e) {} + try { + result = format.nextRecord(result); + fail("Empty String Parse Exception was not thrown! (LongParser)"); + } catch (ParseException e) {} + try { + result = format.nextRecord(result); --- End diff -- Doesn't this call fail because of the trailing whitespace in the `short` field? Bug in DoubleParser and FloatParser - empty String is not casted to 0 - Key: FLINK-1820 URL: https://issues.apache.org/jira/browse/FLINK-1820 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.8.0, 0.9, 0.8.1 Reporter: Felix Neutatz Assignee: Felix Neutatz Priority: Critical Fix For: 0.9 Hi, I found the bug, when I wanted to read a csv file, which had a line like: ||\n If I treat it as a Tuple2<Long, Long>, I get as expected a tuple (0L,0L). But if I want to read it into a Double-Tuple or a Float-Tuple, I get the following error: java.lang.AssertionError: Test failed due to a org.apache.flink.api.common.io.ParseException: Line could not be parsed: '||' ParserError NUMERIC_VALUE_FORMAT_ERROR This error can be solved by adding an additional condition for empty strings in the FloatParser / DoubleParser. We definitely need the CSVReader to be able to read empty values. I can fix it like described if there are no better ideas :) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (FLINK-1937) Cannot create SparseVector with only one non-zero element.
[ https://issues.apache.org/jira/browse/FLINK-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann reassigned FLINK-1937: Assignee: Till Rohrmann Cannot create SparseVector with only one non-zero element. -- Key: FLINK-1937 URL: https://issues.apache.org/jira/browse/FLINK-1937 Project: Flink Issue Type: Bug Components: Machine Learning Library Reporter: Chiwan Park Assignee: Till Rohrmann Labels: ML I tried creating a SparseVector with only one non-zero element, but I couldn't create it. The following code causes the problem. {code} val vec2 = SparseVector.fromCOO(3, (1, 1)) {code} I got the following compile error: {code:none} Error:(60, 29) overloaded method value fromCOO with alternatives: (size: Int,entries: Iterable[(Int, Double)])org.apache.flink.ml.math.SparseVector and (size: Int,entries: (Int, Double)*)org.apache.flink.ml.math.SparseVector cannot be applied to (Int, (Int, Int)) val vec2 = SparseVector.fromCOO(3, (1, 1)) ^ {code}
[jira] [Issue Comment Deleted] (FLINK-1937) Cannot create SparseVector with only one non-zero element.
[ https://issues.apache.org/jira/browse/FLINK-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-1937: - Comment: was deleted (was: Hi [~chiwanpark], the problem is that you're giving a tuple of (Int, Int) to the function {{fromCOO}} which expects a tuple of (Int, Double). Creating the {{SparseVector}} with {code} val vec2 = SparseVector.fromCOO(3, (1, 1.0)) {code} should fix your problem. The underlying problem is that the Scala compiler cannot cast a tuple of (Int, Int) to (Int, Double) even though Int values are a subset of Double. We could add methods which do this for us, though.)
[jira] [Commented] (FLINK-1911) DataStream and DataSet projection is out of sync
[ https://issues.apache.org/jira/browse/FLINK-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513964#comment-14513964 ] ASF GitHub Bot commented on FLINK-1911: --- GitHub user szape opened a pull request: https://github.com/apache/flink/pull/630 [FLINK-1911] [streaming] Streaming projection without types Since the DataSet projection has been reworked to not require the .types(...) call, the Streaming and Batch methods were out of sync. So, the streaming API projection was modified accordingly. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink FLINK-1911 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/630.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #630 commit a36e7bfe8cf5d5a949714de215aaabf03494f62d Author: szape nemderogator...@gmail.com Date: 2015-04-20T14:53:46Z [FLINK-1911] [streaming] Working streaming projection prototype without types DataStream and DataSet projection is out of sync Key: FLINK-1911 URL: https://issues.apache.org/jira/browse/FLINK-1911 Project: Flink Issue Type: Bug Components: Java API, Streaming Reporter: Gyula Fora Assignee: Péter Szabó Since the DataSet projection has been reworked to not require the .types(...) call, the Streaming and Batch methods are out of sync. The streaming API projection needs to be modified accordingly.
[GitHub] flink pull request: [FLINK-1923] Replaces asynchronous logging wit...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/628#issuecomment-96622572 I am curious, why did you rewrite the TaskManager? I thought that one was logging synchronously already. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1820] CSVReader: In case of an empty st...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29143763 --- Diff: flink-core/src/main/java/org/apache/flink/types/parser/DoubleParser.java --- @@ -103,6 +112,10 @@ public static final double parseField(byte[] bytes, int startPos, int length, ch } String str = new String(bytes, startPos, i); + int len = str.length(); + if (len > str.trim().length()) { --- End diff -- `String.trim()` creates a new String object. Checking if the first or last character of the String is a whitespace is probably more efficient.
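The review suggestion above can be sketched as follows: the trim-based condition `len > str.trim().length()` is true exactly when the string has leading or trailing whitespace, so checking only the first and last characters gives the same answer without allocating a new String. The helper name is hypothetical, not Flink code.

```java
// Sketch of the reviewer's suggestion: detect leading/trailing whitespace by
// inspecting only the boundary characters, instead of allocating a trimmed
// copy with String.trim(). Illustrative helper, not Flink's parser code.
public class WhitespaceCheck {

    static boolean hasLeadingOrTrailingWhitespace(String s) {
        if (s.isEmpty()) {
            return false;
        }
        return Character.isWhitespace(s.charAt(0))
                || Character.isWhitespace(s.charAt(s.length() - 1));
    }

    public static void main(String[] args) {
        // same truth value as (s.length() > s.trim().length()), no allocation
        System.out.println(hasLeadingOrTrailingWhitespace("3 "));  // prints true
        System.out.println(hasLeadingOrTrailingWhitespace("42")); // prints false
    }
}
```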
[GitHub] flink pull request: [FLINK-1820] CSVReader: In case of an empty st...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29144391 --- Diff: flink-core/src/main/java/org/apache/flink/types/parser/IntParser.java --- @@ -38,19 +38,28 @@ public int parseField(byte[] bytes, int startPos, int limit, byte[] delimiter, I final int delimLimit = limit-delimiter.length+1; + if (bytes.length == 0) { --- End diff -- This check is not strictly necessary, IMO. `bytes` is a larger byte array which is reused by the calling `GenericCsvInputFormat`. To reduce the processing overhead of each field, I would omit the check (here and in the Long and Short parsers)
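The point about the reused buffer can be illustrated with a sketch: the byte array holds a whole record and is reused across fields, so `bytes.length` says nothing about the current field; only `startPos` and the delimiter position matter. The class below is a hypothetical stand-in, not Flink's IntParser.

```java
// The byte[] holds a full record that is reused for every field, so checking
// bytes.length tells us nothing about the field being parsed. A parser should
// only look at the range from startPos up to the next delimiter. Illustrative.
public class ReusedBufferIntParser {

    static int parseField(byte[] bytes, int startPos, int limit, byte delimiter) {
        int i = startPos;
        while (i < limit && bytes[i] != delimiter) {
            i++;
        }
        return Integer.parseInt(new String(bytes, startPos, i - startPos));
    }

    public static void main(String[] args) {
        byte[] record = "12|345|7".getBytes(); // one reused record buffer
        System.out.println(parseField(record, 0, record.length, (byte) '|')); // prints 12
        System.out.println(parseField(record, 3, record.length, (byte) '|')); // prints 345
    }
}
```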
[jira] [Created] (FLINK-1949) YARNSessionFIFOITCase sometimes fails to detect when the detached session finishes
Robert Metzger created FLINK-1949: - Summary: YARNSessionFIFOITCase sometimes fails to detect when the detached session finishes Key: FLINK-1949 URL: https://issues.apache.org/jira/browse/FLINK-1949 Project: Flink Issue Type: Bug Components: Tests, YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger {code} 10:32:24,393 INFO org.apache.flink.yarn.YARNSessionFIFOITCase - CLI Frontend has returned, so the job is running 10:32:24,398 INFO org.apache.flink.yarn.YARNSessionFIFOITCase - waiting for the job with appId application_1430130687160_0003 to finish 10:32:24,629 INFO org.apache.flink.yarn.YARNSessionFIFOITCase - The job has finished. TaskManager output file found /home/travis/build/tillrohrmann/flink/flink-yarn-tests/../flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1430130687160_0003/container_1430130687160_0003_01_02/taskmanager-stdout.log 10:32:24,630 WARN org.apache.flink.yarn.YARNSessionFIFOITCase - Error while detached yarn session was running java.lang.AssertionError: Expected string '(all,2)' not found in string '' at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.flink.yarn.YARNSessionFIFOITCase.testDetachedPerJobYarnClusterInternal(YARNSessionFIFOITCase.java:504) at org.apache.flink.yarn.YARNSessionFIFOITCase.testDetachedPerJobYarnClusterWithStreamingJob(YARNSessionFIFOITCase.java:563) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) {code} https://flink.a.o.uce.east.s3.amazonaws.com/travis-artifacts/tillrohrmann/flink/442/442.5.tar.gz
[jira] [Commented] (FLINK-1945) Make python tests less verbose
[ https://issues.apache.org/jira/browse/FLINK-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514117#comment-14514117 ] Chesnay Schepler commented on FLINK-1945: - There are no print statements, all output is either a) job progression: 04/26/2015 15:07:09 MapPartition (PythonMapPartition)(1/1) switched to SCHEDULED b) job output: String successful! (yes, this is actually the output of the job) I'll still try to get rid of the logging, though; there was recently a similar issue, so it shouldn't be too hard to fix. Make python tests less verbose -- Key: FLINK-1945 URL: https://issues.apache.org/jira/browse/FLINK-1945 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Assignee: Chesnay Schepler Priority: Minor Currently, the python tests print a lot of log messages to stdout. Furthermore, there seem to be some println statements which clutter the console output. I think that these log messages are not required for the tests and thus should be suppressed.
[GitHub] flink pull request: [FLINK-1933] Add distance measure interface an...
GitHub user chiwanpark opened a pull request: https://github.com/apache/flink/pull/629 [FLINK-1933] Add distance measure interface and basic implementation to machine learning library This PR contains the following changes: * Add `dot` method and `magnitude` method. * Add `DistanceMeasure` trait. * Add 7 basic implementations of `DistanceMeasure`. * Add tests for above changes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/chiwanpark/flink FLINK-1933 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/629.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #629 commit 3fac0ff339de93ab1c3b3582924af75a1e6057ea Author: Chiwan Park chiwanp...@icloud.com Date: 2015-04-24T03:50:28Z [FLINK-1933] [ml] Add dot product and magnitude into Vector commit c8f940c2439f754ef0e640b5440507bce4b859d2 Author: Chiwan Park chiwanp...@icloud.com Date: 2015-04-27T09:48:56Z [FLINK-1933] [ml] Add distance measure interface and basic implementation to machine learning library
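The PR's building blocks (a dot product, a magnitude, and a pluggable distance-measure abstraction) can be sketched as follows. This is an illustrative Java analogue of the idea, not the Scala `DistanceMeasure` trait the PR actually adds to flink-ml.

```java
// Illustrative analogue of the PR's building blocks: dot product, magnitude,
// and a pluggable distance measure. Not the actual flink-ml Scala API.
public class Distances {

    interface DistanceMeasure {
        double distance(double[] a, double[] b);
    }

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    static double magnitude(double[] a) {
        return Math.sqrt(dot(a, a));
    }

    // Euclidean distance expressed via the dot-product primitive:
    // ||a - b|| = sqrt(dot(a,a) - 2*dot(a,b) + dot(b,b))
    static final DistanceMeasure EUCLIDEAN = (a, b) ->
            Math.sqrt(dot(a, a) - 2 * dot(a, b) + dot(b, b));

    public static void main(String[] args) {
        double[] a = {0.0, 3.0};
        double[] b = {4.0, 0.0};
        System.out.println(magnitude(a));               // prints 3.0
        System.out.println(EUCLIDEAN.distance(a, b));   // prints 5.0
    }
}
```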
[GitHub] flink pull request: [FLINK-1924] Minor Refactoring
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/616#issuecomment-96617964 @aljoscha Alright, will merge in a bit.
[jira] [Commented] (FLINK-1924) [Py] Refactor a few minor things
[ https://issues.apache.org/jira/browse/FLINK-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513949#comment-14513949 ] ASF GitHub Bot commented on FLINK-1924: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/616#issuecomment-96617964 @aljoscha Alright, will merge in a bit. [Py] Refactor a few minor things Key: FLINK-1924 URL: https://issues.apache.org/jira/browse/FLINK-1924 Project: Flink Issue Type: Improvement Components: Python API Affects Versions: 0.9 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Priority: Trivial Fix For: 0.9
[jira] [Commented] (FLINK-1925) Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD
[ https://issues.apache.org/jira/browse/FLINK-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513978#comment-14513978 ] ASF GitHub Bot commented on FLINK-1925: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/622#discussion_r29140570 --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala --- @@ -137,7 +137,7 @@ extends Actor with ActorLogMessages with ActorLogging { protected val resources = HardwareDescription.extractFromSystem(memoryManager.getMemorySize) /** Registry of all tasks currently executed by this TaskManager */ - protected val runningTasks = scala.collection.mutable.HashMap[ExecutionAttemptID, Task]() + protected val runningTasks = scala.collection.concurrent.TrieMap[ExecutionAttemptID, Task]() --- End diff -- I have recently read (I think in the Databricks Scala guide) that they discourage the Scala concurrent package because of bugs. How about using the Java concurrent HashMap? That one has pretty good performance and seems to work reliably. Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD Key: FLINK-1925 URL: https://issues.apache.org/jira/browse/FLINK-1925 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Assignee: Till Rohrmann A user reported that a job times out while submitting tasks to the TaskManager. The reason is that the JobManager expects a TaskOperationResult response upon submitting a task to the TM. The TM then downloads the required jars from the JM, which blocks the actor thread and can take a very long time if many TMs download from the JM. Due to this, the SubmitTask future throws a TimeoutException. A possible solution could be that the TM eagerly acknowledges the reception of the SubmitTask message and executes the task initialization within a future. 
Upon completion, the future will send an UpdateTaskExecutionState message to the JM, which switches the state of the task from deploying to running. This means that the handler of the SubmitTask future in {{Execution}} won't change the state of the task.
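The reviewer's suggestion, a `java.util.concurrent` map for the task registry, can be sketched as below. The class and the String key/value types are simplified stand-ins for Flink's `ExecutionAttemptID` and `Task`, not TaskManager code.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the suggestion to back the task registry with
// java.util.concurrent.ConcurrentHashMap instead of Scala's TrieMap.
// Key/value types are simplified stand-ins for ExecutionAttemptID and Task.
public class TaskRegistry {

    private final ConcurrentHashMap<String, String> runningTasks = new ConcurrentHashMap<>();

    void register(String attemptId, String task) {
        runningTasks.put(attemptId, task);
    }

    String unregister(String attemptId) {
        // remove is atomic, so concurrent actor/future threads cannot
        // unregister the same task twice
        return runningTasks.remove(attemptId);
    }

    public static void main(String[] args) {
        TaskRegistry registry = new TaskRegistry();
        registry.register("attempt-1", "MapTask");
        System.out.println(registry.unregister("attempt-1")); // prints MapTask
        System.out.println(registry.unregister("attempt-1")); // prints null
    }
}
```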
[jira] [Commented] (FLINK-1925) Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD
[ https://issues.apache.org/jira/browse/FLINK-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513993#comment-14513993 ] ASF GitHub Bot commented on FLINK-1925: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/622#issuecomment-96627457 Looks good, modulo some comments. The most critical being the concurrent Map one.
[jira] [Created] (FLINK-1948) Add manual task slot handling for streaming jobs
Gyula Fora created FLINK-1948: - Summary: Add manual task slot handling for streaming jobs Key: FLINK-1948 URL: https://issues.apache.org/jira/browse/FLINK-1948 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gyula Fora Assignee: Gyula Fora Currently, all stream operators are automatically assigned to the same task sharing group, and the user has no control over this setting. We should add a way to manually affect the way operators are allocated to task manager slots.
[jira] [Commented] (FLINK-1937) Cannot create SparseVector with only one non-zero element.
[ https://issues.apache.org/jira/browse/FLINK-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514095#comment-14514095 ] Till Rohrmann commented on FLINK-1937: -- Hi [~chiwanpark], the problem is that you're giving a tuple of (Int, Int) to the function {{fromCOO}} which expects a tuple of (Int, Double). Creating the {{SparseVector}} with {code} val vec2 = SparseVector.fromCOO(3, (1, 1.0)) {code} should fix your problem. The underlying problem is that the Scala compiler cannot cast a tuple of (Int, Int) to (Int, Double) even though Int values are a subset of Double. We could add methods which do this for us, though.
[jira] [Created] (FLINK-1950) Increase default heap cutoff ratio from 20% to 30% and move default value to ConfigConstants
Robert Metzger created FLINK-1950: - Summary: Increase default heap cutoff ratio from 20% to 30% and move default value to ConfigConstants Key: FLINK-1950 URL: https://issues.apache.org/jira/browse/FLINK-1950 Project: Flink Issue Type: Task Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Priority: Blocker It seems that too many production users are facing issues with YARN killing containers due to resource overusage. We can mitigate the issue by using only 70% of the specified memory for the heap.
[jira] [Commented] (FLINK-1950) Increase default heap cutoff ratio from 20% to 30% and move default value to ConfigConstants
[ https://issues.apache.org/jira/browse/FLINK-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514232#comment-14514232 ] Robert Metzger commented on FLINK-1950: --- Spark is removing 10%, but at least 384 MB. They got to the value experimentally (that's my approach as well ;) https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala#L92
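The two heuristics under discussion, a flat cutoff ratio (reserving 30% so that 70% of the container goes to the heap) versus Spark's max(384 MB, 10%) overhead, can be compared with a small sketch. The numbers come from the thread; the helper class itself is hypothetical, not Flink's YARN client code.

```java
// Sketch comparing the two cutoff heuristics from the discussion: a flat
// ratio (e.g. reserve 30%, leaving 70% for the heap) versus Spark's
// max(384 MB, 10% of container) overhead. Illustrative only.
public class HeapCutoff {

    // heap = container * (1 - cutoffRatio)
    static long flatRatioHeap(long containerMb, double cutoffRatio) {
        return (long) (containerMb * (1.0 - cutoffRatio));
    }

    // heap = container - max(384 MB, 10% of container)
    static long sparkStyleHeap(long containerMb) {
        long overhead = Math.max(384, (long) (containerMb * 0.10));
        return containerMb - overhead;
    }

    public static void main(String[] args) {
        // For a 1024 MB container: flat 30% cutoff leaves 716 MB,
        // while the Spark-style rule leaves only 640 MB (384 MB > 10%).
        System.out.println(flatRatioHeap(1024, 0.30)); // prints 716
        System.out.println(sparkStyleHeap(1024));      // prints 640
    }
}
```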
[GitHub] flink pull request: [FLINK-1820] CSVReader: In case of an empty st...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29147205 --- Diff: flink-core/src/main/java/org/apache/flink/types/parser/IntParser.java --- @@ -38,19 +38,28 @@ public int parseField(byte[] bytes, int startPos, int limit, byte[] delimiter, I final int delimLimit = limit-delimiter.length+1; + if (bytes.length == 0) { --- End diff -- I see. This is probably because an empty test string causes the test to call the parser with a 0-length array. We could add a dedicated `testEmptyField` test method to the `ParserTestBase` and remove the empty Strings from the set of invalid inputs.
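A dedicated empty-field test along the lines suggested above could look roughly like this. Both the test shape and the stand-in parser are hypothetical sketches, not Flink's `ParserTestBase` or `IntParser`: the point is only that a zero-length field should be exercised explicitly and expected to signal a parse error.

```java
// Sketch of a dedicated empty-field test: feed the parser a field of zero
// length and assert that it signals a parse error. The parser here is a
// hypothetical stand-in, not Flink's IntParser/ParserTestBase.
public class EmptyFieldTest {

    static int parseIntField(byte[] bytes, int startPos, int length) {
        if (length == 0) {
            throw new NumberFormatException("empty field");
        }
        return Integer.parseInt(new String(bytes, startPos, length));
    }

    static boolean emptyFieldRejected() {
        byte[] record = "||".getBytes(); // zero-length field between delimiters
        try {
            parseIntField(record, 1, 0);
            return false; // no exception: the empty field was silently accepted
        } catch (NumberFormatException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(emptyFieldRejected()); // prints true
    }
}
```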
[jira] [Commented] (FLINK-1950) Increase default heap cutoff ratio from 20% to 30% and move default value to ConfigConstants
[ https://issues.apache.org/jira/browse/FLINK-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514215#comment-14514215 ] Stephan Ewen commented on FLINK-1950: - 20% and 30% are both rather random magic numbers. Is there a better way to do this?
[jira] [Commented] (FLINK-1820) Bug in DoubleParser and FloatParser - empty String is not casted to 0
[ https://issues.apache.org/jira/browse/FLINK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514171#comment-14514171 ] ASF GitHub Bot commented on FLINK-1820: --- Github user FelixNeutatz commented on a diff in the pull request: https://github.com/apache/flink/pull/566#discussion_r29148302 --- Diff: flink-core/src/main/java/org/apache/flink/types/parser/IntParser.java --- @@ -38,19 +38,28 @@ public int parseField(byte[] bytes, int startPos, int limit, byte[] delimiter, I final int delimLimit = limit-delimiter.length+1; + if (bytes.length == 0) { --- End diff -- Sounds good (y)