[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132128284
  
+1 for having two different methods by return type but we need more 
comments from @tillrohrmann, @thvasilo or other people because I'm not sure 
this is best approach.

Would be okay the method names are `createContinuousHistogram` and 
`createCategoricalHistogram` if we decide create two methods?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2448) registerCacheFile fails with MultipleProgramsTestbase

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700983#comment-14700983
 ] 

ASF GitHub Bot commented on FLINK-2448:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1031#issuecomment-132136444
  
I am not sure this is solving it the right way. I find it more intuitive if 
cache files stay registered. They are part of the environment and should be 
sticky, like configuration settings as well (they don't get lost after 
execution).

We could add a method to clear registered cache files. But if the problem 
is really only the MultiplProgramsTest base, then this should be fixed to 
return a new ExecutionEnvironment from a (the) TestEnvironmentFactory every 
time you call `getExecutionEnvironment()`.


 registerCacheFile fails with MultipleProgramsTestbase
 -

 Key: FLINK-2448
 URL: https://issues.apache.org/jira/browse/FLINK-2448
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Chesnay Schepler
Priority: Minor

 When trying to register a file using a constant name an expection is thrown 
 saying the file was already cached.
 This is probably because the same environment is reused, and the cacheFile 
 entries are not cleared between runs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2448]Clear cache file list in Execution...

2015-08-18 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1031#issuecomment-132136444
  
I am not sure this is solving it the right way. I find it more intuitive if 
cache files stay registered. They are part of the environment and should be 
sticky, like configuration settings as well (they don't get lost after 
execution).

We could add a method to clear registered cache files. But if the problem 
is really only the MultiplProgramsTest base, then this should be fixed to 
return a new ExecutionEnvironment from a (the) TestEnvironmentFactory every 
time you call `getExecutionEnvironment()`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2448]Clear cache file list in Execution...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1031#issuecomment-132140770
  
I'm not sure if I can form an opinion on this. I personally haven't used 
them at all with Flink, only with Hadoop. A call to clear cache files certainly 
makes sense though. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2448) registerCacheFile fails with MultipleProgramsTestbase

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701000#comment-14701000
 ] 

ASF GitHub Bot commented on FLINK-2448:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1031#issuecomment-132140770
  
I'm not sure if I can form an opinion on this. I personally haven't used 
them at all with Flink, only with Hadoop. A call to clear cache files certainly 
makes sense though. 


 registerCacheFile fails with MultipleProgramsTestbase
 -

 Key: FLINK-2448
 URL: https://issues.apache.org/jira/browse/FLINK-2448
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Chesnay Schepler
Priority: Minor

 When trying to register a file using a constant name an expection is thrown 
 saying the file was already cached.
 This is probably because the same environment is reused, and the cacheFile 
 entries are not cleared between runs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread thvasilo
Github user thvasilo commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132146208
  
Sounds good. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [flink-2532]fix the function name and the vari...

2015-08-18 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/1025#issuecomment-132153447
  
Hey @Rucongzhang,

As this is your first contribution to Flink and the PR was issued before a 
relevant notice on the developer mailing list that advises against merging it 
[1], I am merging this - but I highlight the fact the Flink community seems to 
have an agreement that these changes should be mostly rejected in the future.

I strongly agree with @fhueske's comment on the mailing list [2] that you 
find plenty of already open relevant issues on JIRA, I would suggest to pick up 
one of those. If an issue you like happens to be a bit unclear the community is 
happy to give you directions.

Happy coding Flink. :smile: 

[1] 
https://mail-archives.apache.org/mod_mbox/flink-dev/201508.mbox/%3CCANC1h_sTnwSunrjsBY%2BVHrBM2wimkAY7HDKfjM7ZWSx6pYRDFg%40mail.gmail.com%3E
[2] 
https://mail-archives.apache.org/mod_mbox/flink-dev/201508.mbox/%3CCAAdrtT02%3Dk6WNKVeROpqzdW78zn%2Bhv-OX8BD-uQ17S7vtV3QbQ%40mail.gmail.com%3E


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132170286
  
@thvasilo , @chiwanpark , I've made the required changes. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2529) fix on some unused code for flink-runtime

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701077#comment-14701077
 ] 

ASF GitHub Bot commented on FLINK-2529:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/1022#issuecomment-132172991
  
@StephanEwen 
Thank you!
Do you mean that this PR would not be closed and I can push future checks 
in this?


 fix on some unused code for flink-runtime
 -

 Key: FLINK-2529
 URL: https://issues.apache.org/jira/browse/FLINK-2529
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 In file BlobServer.java, I found the Thread.currentThread() will never return 
 null in my learned knowledge.
 So I think shutdownHook != null“ is not necessary in code 'if (shutdownHook 
 != null  shutdownHook != Thread.currentThread())';
 And I am not complete sure.
 Maybe I am wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132118352
  
What about having two different functions? One for Discrete and one for 
continuous? 
Or perhaps just one for the Continuous as that is more likely to be used.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-2540) LocalBufferPool.requestBuffer gets into infinite loop

2015-08-18 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-2540:
---
Component/s: Core

 LocalBufferPool.requestBuffer gets into infinite loop
 -

 Key: FLINK-2540
 URL: https://issues.apache.org/jira/browse/FLINK-2540
 Project: Flink
  Issue Type: Bug
  Components: Core
Reporter: Gabor Gevay

 I'm trying to run a complicated computation that looks like this: [1].
 One of the DataSource-Filter-Map chains finishes fine, but the other one 
 freezes. Debugging shows that it is spinning in the while loop in 
 LocalBufferPool.requestBuffer.
 askToRecycle is false. Both numberOfRequestedMemorySegments and 
 currentPoolSize is 128, so it never goes into that if either.
 This is a stack trace: [2]
 And here is the code, if you would like to run it: [3]. Unfortunately, I 
 can't make it more minimal, becuase if I remove some operators, the problem 
 disappears. The class to start is malom.Solver. (On first run, it calculates 
 some lookuptables for a few minutes, and puts them into /tmp/movegen)
 [1] http://compalg.inf.elte.hu/~ggevay/flink/plan.txt
 [2] http://compalg.inf.elte.hu/~ggevay/flink/stacktrace.txt
 [3] https://github.com/ggevay/flink/tree/deadlock-malom



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2529][runtime]remove some unused code

2015-08-18 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/1022#issuecomment-132172991
  
@StephanEwen 
Thank you!
Do you mean that this PR would not be closed and I can push future checks 
in this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132134747
  
Okay. That sounds good enough. Can we make a final decision then?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2529][runtime]remove some unused code

2015-08-18 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1022#issuecomment-132138231
  
The change in the execution class is good. I'll leave the other check in, 
as it is a good sanity check against future mistakes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2540) LocalBufferPool.requestBuffer gets into infinite loop

2015-08-18 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2540:
--

 Summary: LocalBufferPool.requestBuffer gets into infinite loop
 Key: FLINK-2540
 URL: https://issues.apache.org/jira/browse/FLINK-2540
 Project: Flink
  Issue Type: Bug
Reporter: Gabor Gevay


I'm trying to run a complicated computation that looks like this: [1].
One of the DataSource-Filter-Map chains finishes fine, but the other one 
freezes. Debugging shows that it is spinning in the while loop in 
LocalBufferPool.requestBuffer.

askToRecycle is false. Both numberOfRequestedMemorySegments and currentPoolSize 
is 128, so it never goes into that if either.

This is a stack trace: [2]

And here is the code, if you would like to run it: [3]. Unfortunately, I can't 
make it more minimal, becuase if I remove some operators, the problem 
disappears. The class to start is malom.Solver. (On first run, it calculates 
some lookuptables for a few minutes, and puts them into /tmp/movegen)

[1] http://compalg.inf.elte.hu/~ggevay/flink/plan.txt
[2] http://compalg.inf.elte.hu/~ggevay/flink/stacktrace.txt
[3] https://github.com/ggevay/flink/tree/deadlock-malom



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1901) Create sample operator for Dataset

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701026#comment-14701026
 ] 

ASF GitHub Bot commented on FLINK-1901:
---

Github user ChengXiangLi commented on the pull request:

https://github.com/apache/flink/pull/949#issuecomment-132152536
  
Hi, @tillrohrmann , do you have sometime to continue to review this PR and 
help to push its progress?


 Create sample operator for Dataset
 --

 Key: FLINK-1901
 URL: https://issues.apache.org/jira/browse/FLINK-1901
 Project: Flink
  Issue Type: Improvement
  Components: Core
Reporter: Theodore Vasiloudis
Assignee: Chengxiang Li

 In order to be able to implement Stochastic Gradient Descent and a number of 
 other machine learning algorithms we need to have a way to take a random 
 sample from a Dataset.
 We need to be able to sample with or without replacement from the Dataset, 
 choose the relative or exact size of the sample, set a seed for 
 reproducibility, and support sampling within iterations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1901] [core] Create sample operator for...

2015-08-18 Thread ChengXiangLi
Github user ChengXiangLi commented on the pull request:

https://github.com/apache/flink/pull/949#issuecomment-132152536
  
Hi, @tillrohrmann , do you have sometime to continue to review this PR and 
help to push its progress?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2366) HA Without ZooKeeper

2015-08-18 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700876#comment-14700876
 ] 

Stephan Ewen commented on FLINK-2366:
-

The leader election service is pluggable in Flink's runtime. Should be 
relatively easy to plug in other services.

Let's see what users request...

 HA Without ZooKeeper
 

 Key: FLINK-2366
 URL: https://issues.apache.org/jira/browse/FLINK-2366
 Project: Flink
  Issue Type: Improvement
Reporter: Suminda Dharmasena
Priority: Minor

 Please provide a way to do HA without having to use ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1962] Add Gelly Scala API v2

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/1004#issuecomment-132115823
  
Looks good to merge except one cosmetic issue. Awesome job! :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700903#comment-14700903
 ] 

ASF GitHub Bot commented on FLINK-1962:
---

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/1004#issuecomment-132115823
  
Looks good to merge except one cosmetic issue. Awesome job! :)


 Add Gelly Scala API
 ---

 Key: FLINK-1962
 URL: https://issues.apache.org/jira/browse/FLINK-1962
 Project: Flink
  Issue Type: Task
  Components: Gelly, Scala API
Affects Versions: 0.9
Reporter: Vasia Kalavri
Assignee: PJ Van Aeken





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2477) Add test for SocketClientSink

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700841#comment-14700841
 ] 

ASF GitHub Bot commented on FLINK-2477:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/977#issuecomment-132100631
  
@StephanEwen 
Hi, I`m really very sorry to bother you again.
I just wonder whether this can be merged or I need to do some fix more or 
close this PR?



 Add test for SocketClientSink
 -

 Key: FLINK-2477
 URL: https://issues.apache.org/jira/browse/FLINK-2477
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.10
 Environment: win7 32bit;linux
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 Add some tests for SocketClientSink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...

2015-08-18 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/977#issuecomment-132100631
  
@StephanEwen 
Hi, I`m really very sorry to bother you again.
I just wonder whether this can be merged or I need to do some fix more or 
close this PR?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2521] [tests] Adds automatic test name ...

2015-08-18 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1015#issuecomment-132118469
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1962] Add Gelly Scala API v2

2015-08-18 Thread PieterJanVanAeken
Github user PieterJanVanAeken commented on the pull request:

https://github.com/apache/flink/pull/1004#issuecomment-132118620
  
I fixed the unneccessary comment and did a quick rebase to keep up with 
master branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2521) Add automatic test name logging for tests

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700918#comment-14700918
 ] 

ASF GitHub Bot commented on FLINK-2521:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1015#issuecomment-132118469
  
+1


 Add automatic test name logging for tests
 -

 Key: FLINK-2521
 URL: https://issues.apache.org/jira/browse/FLINK-2521
 Project: Flink
  Issue Type: Improvement
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Priority: Minor

 When running tests on travis the Flink components log to a file. This is 
 helpful in case of a failed test to retrieve the error. However, the log does 
 not contain the test name and the reason for the failure. Therefore it is 
 difficult to find the log output which corresponds to the failed test.
 It would be nice to automatically add the test case information to the log. 
 This would ease the debugging process big time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2366) HA Without ZooKeeper

2015-08-18 Thread Suminda Dharmasena (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700776#comment-14700776
 ] 

Suminda Dharmasena commented on FLINK-2366:
---

Can you consider something more simpler. Have the ability for the end user to 
choose. E.g. the user can choose https://www.serfdom.io/  
https://www.consul.io/ in case does not want use ZK. Also 
https://github.com/kuujo/copycat.

 HA Without ZooKeeper
 

 Key: FLINK-2366
 URL: https://issues.apache.org/jira/browse/FLINK-2366
 Project: Flink
  Issue Type: Improvement
Reporter: Suminda Dharmasena
Priority: Minor

 Please provide a way to do HA without having to use ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132116059
  
I think that we can merge this PR after we decide the return type of 
`createHistogram` method. Any other points seem okay.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2539) More unified code style for Scala code

2015-08-18 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2539:
--

 Summary: More unified code style for Scala code
 Key: FLINK-2539
 URL: https://issues.apache.org/jira/browse/FLINK-2539
 Project: Flink
  Issue Type: Improvement
Reporter: Chiwan Park
Priority: Minor


We need more specific code style guide for Scala to prevent code style 
differentiation. We discussed about this in [mailing 
list|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Code-style-guideline-for-Scala-td7526.html].

Following works are needed:
* Providing code formatting configuration for Eclipse and IntelliJ IDEA
* More detail description in 
[wiki|https://cwiki.apache.org/confluence/display/FLINK/Coding+Guidelines+for+Scala]
 and [Coding Guidelines|http://flink.apache.org/coding-guidelines.html] in 
homepage
* More strict rules in scala style checker (We need to discuss more)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1962] Add Gelly Scala API v2

2015-08-18 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1004#issuecomment-132119301
  
All right, let's merge this!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700927#comment-14700927
 ] 

ASF GitHub Bot commented on FLINK-1962:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1004#issuecomment-132119301
  
All right, let's merge this!


 Add Gelly Scala API
 ---

 Key: FLINK-1962
 URL: https://issues.apache.org/jira/browse/FLINK-1962
 Project: Flink
  Issue Type: Task
  Components: Gelly, Scala API
Affects Versions: 0.9
Reporter: Vasia Kalavri
Assignee: PJ Van Aeken





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2521) Add automatic test name logging for tests

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700834#comment-14700834
 ] 

ASF GitHub Bot commented on FLINK-2521:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1015#issuecomment-132097606
  
Good idea @StephanEwen, I'll make the log statements more prominent and 
then I'll merge the PR.


 Add automatic test name logging for tests
 -

 Key: FLINK-2521
 URL: https://issues.apache.org/jira/browse/FLINK-2521
 Project: Flink
  Issue Type: Improvement
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Priority: Minor

 When running tests on travis the Flink components log to a file. This is 
 helpful in case of a failed test to retrieve the error. However, the log does 
 not contain the test name and the reason for the failure. Therefore it is 
 difficult to find the log output which corresponds to the failed test.
 It would be nice to automatically add the test case information to the log. 
 This would ease the debugging process big time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2521] [tests] Adds automatic test name ...

2015-08-18 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1015#issuecomment-132097606
  
Good idea @StephanEwen, I'll make the log statements more prominent and 
then I'll merge the PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700901#comment-14700901
 ] 

ASF GitHub Bot commented on FLINK-1962:
---

Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/1004#discussion_r37273914
  
--- Diff: 
flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala
 ---
@@ -0,0 +1,735 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph.scala
+
+import org.apache.flink.api.common.functions.{FilterFunction, MapFunction}
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.{tuple = jtuple}
+import org.apache.flink.api.scala._
+import org.apache.flink.graph._
+import org.apache.flink.graph.gsa.{ApplyFunction, GSAConfiguration, 
GatherFunction, SumFunction}
+import org.apache.flink.graph.spargel.{MessagingFunction, 
VertexCentricConfiguration, VertexUpdateFunction}
+import org.apache.flink.{graph = jg}
+
+import _root_.scala.collection.JavaConverters._
+import _root_.scala.reflect.ClassTag
+
+object Graph {
+  def fromDataSet[K: TypeInformation : ClassTag, VV: TypeInformation : 
ClassTag, EV:
+  TypeInformation : ClassTag](vertices: DataSet[Vertex[K, VV]], edges: 
DataSet[Edge[K, EV]],
+  env: ExecutionEnvironment): Graph[K, VV, EV] 
= {
+wrapGraph(jg.Graph.fromDataSet[K, VV, EV](vertices.javaSet, 
edges.javaSet, env.getJavaEnv))
+  }
+
+  def fromCollection[K: TypeInformation : ClassTag, VV: TypeInformation : 
ClassTag, EV:
+  TypeInformation : ClassTag](vertices: Seq[Vertex[K, VV]], edges: 
Seq[Edge[K, EV]], env:
+  ExecutionEnvironment): Graph[K, VV, EV] = {
+wrapGraph(jg.Graph.fromCollection[K, VV, 
EV](vertices.asJavaCollection, edges
+  .asJavaCollection, env.getJavaEnv))
+  }
+}
+
+/**
+ * Represents a graph consisting of {@link Edge edges} and {@link Vertex 
vertices}.
+ * @param jgraph the underlying java api Graph.
+ * @tparam K the key type for vertex and edge identifiers
+ * @tparam VV the value type for vertices
+ * @tparam EV the value type for edges
+ * @see org.apache.flink.graph.Edge
+ * @see org.apache.flink.graph.Vertex
+ */
+final class Graph[K: TypeInformation : ClassTag, VV: TypeInformation : 
ClassTag, EV:
+TypeInformation : ClassTag](jgraph: jg.Graph[K, VV, EV]) {
+
+  private[flink] def getWrappedGraph = jgraph
+
+
+  private[flink] def clean[F : AnyRef](f: F, checkSerializable: Boolean = 
true): F = {
+if (jgraph.getContext.getConfig.isClosureCleanerEnabled) {
+  ClosureCleaner.clean(f, checkSerializable)
+}
+ClosureCleaner.ensureSerializable(f)
+f
+  }
+
+  /**
+   * @return the vertex DataSet.
+   */
+  def getVertices = wrap(jgraph.getVertices)
+
+  /**
+   * @return the edge DataSet.
+   */
+  def getEdges = wrap(jgraph.getEdges)
+
+  /**
+   * @return the vertex DataSet as Tuple2.
+   */
+  def getVerticesAsTuple2(): DataSet[(K, VV)] = {
+wrap(jgraph.getVerticesAsTuple2).map(jtuple = (jtuple.f0, jtuple.f1))
+  }
+
+  /**
+   * @return the edge DataSet as Tuple3.
+   */
+  def getEdgesAsTuple3(): DataSet[(K, K, EV)] = {
+wrap(jgraph.getEdgesAsTuple3).map(jtuple = (jtuple.f0, jtuple.f1, 
jtuple.f2))
+  }
+
+  /**
+   * Apply a function to the attribute of each vertex in the graph.
+   *
+   * @param mapper the map function to apply.
+   * @return a new graph
+   */
+  def mapVertices[NV: TypeInformation : ClassTag](mapper: 
MapFunction[Vertex[K, VV], NV]):
+  Graph[K, NV, EV] = {
+new Graph[K, NV, EV](jgraph.mapVertices[NV](
+  mapper,
+  createTypeInformation[Vertex[K, NV]]
+))
+  }
+
+  /**
+   * Apply a 

[GitHub] flink pull request: [FLINK-1962] Add Gelly Scala API v2

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/1004#discussion_r37273914
  
--- Diff: 
flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala
 ---
@@ -0,0 +1,735 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph.scala
+
+import org.apache.flink.api.common.functions.{FilterFunction, MapFunction}
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.{tuple = jtuple}
+import org.apache.flink.api.scala._
+import org.apache.flink.graph._
+import org.apache.flink.graph.gsa.{ApplyFunction, GSAConfiguration, 
GatherFunction, SumFunction}
+import org.apache.flink.graph.spargel.{MessagingFunction, 
VertexCentricConfiguration, VertexUpdateFunction}
+import org.apache.flink.{graph = jg}
+
+import _root_.scala.collection.JavaConverters._
+import _root_.scala.reflect.ClassTag
+
+object Graph {
+  def fromDataSet[K: TypeInformation : ClassTag, VV: TypeInformation : 
ClassTag, EV:
+  TypeInformation : ClassTag](vertices: DataSet[Vertex[K, VV]], edges: 
DataSet[Edge[K, EV]],
+  env: ExecutionEnvironment): Graph[K, VV, EV] 
= {
+wrapGraph(jg.Graph.fromDataSet[K, VV, EV](vertices.javaSet, 
edges.javaSet, env.getJavaEnv))
+  }
+
+  def fromCollection[K: TypeInformation : ClassTag, VV: TypeInformation : 
ClassTag, EV:
+  TypeInformation : ClassTag](vertices: Seq[Vertex[K, VV]], edges: 
Seq[Edge[K, EV]], env:
+  ExecutionEnvironment): Graph[K, VV, EV] = {
+wrapGraph(jg.Graph.fromCollection[K, VV, 
EV](vertices.asJavaCollection, edges
+  .asJavaCollection, env.getJavaEnv))
+  }
+}
+
+/**
+ * Represents a graph consisting of {@link Edge edges} and {@link Vertex 
vertices}.
+ * @param jgraph the underlying java api Graph.
+ * @tparam K the key type for vertex and edge identifiers
+ * @tparam VV the value type for vertices
+ * @tparam EV the value type for edges
+ * @see org.apache.flink.graph.Edge
+ * @see org.apache.flink.graph.Vertex
+ */
+final class Graph[K: TypeInformation : ClassTag, VV: TypeInformation : 
ClassTag, EV:
+TypeInformation : ClassTag](jgraph: jg.Graph[K, VV, EV]) {
+
+  private[flink] def getWrappedGraph = jgraph
+
+
+  private[flink] def clean[F : AnyRef](f: F, checkSerializable: Boolean = 
true): F = {
+if (jgraph.getContext.getConfig.isClosureCleanerEnabled) {
+  ClosureCleaner.clean(f, checkSerializable)
+}
+ClosureCleaner.ensureSerializable(f)
+f
+  }
+
+  /**
+   * @return the vertex DataSet.
+   */
+  def getVertices = wrap(jgraph.getVertices)
+
+  /**
+   * @return the edge DataSet.
+   */
+  def getEdges = wrap(jgraph.getEdges)
+
+  /**
+   * @return the vertex DataSet as Tuple2.
+   */
+  def getVerticesAsTuple2(): DataSet[(K, VV)] = {
+wrap(jgraph.getVerticesAsTuple2).map(jtuple = (jtuple.f0, jtuple.f1))
+  }
+
+  /**
+   * @return the edge DataSet as Tuple3.
+   */
+  def getEdgesAsTuple3(): DataSet[(K, K, EV)] = {
+wrap(jgraph.getEdgesAsTuple3).map(jtuple = (jtuple.f0, jtuple.f1, 
jtuple.f2))
+  }
+
+  /**
+   * Apply a function to the attribute of each vertex in the graph.
+   *
+   * @param mapper the map function to apply.
+   * @return a new graph
+   */
+  def mapVertices[NV: TypeInformation : ClassTag](mapper: 
MapFunction[Vertex[K, VV], NV]):
+  Graph[K, NV, EV] = {
+new Graph[K, NV, EV](jgraph.mapVertices[NV](
+  mapper,
+  createTypeInformation[Vertex[K, NV]]
+))
+  }
+
+  /**
+   * Apply a function to the attribute of each vertex in the graph.
+   *
+   * @param fun the map function to apply.
+   * @return a new graph
+   */
+  def mapVertices[NV: TypeInformation : ClassTag](fun: Vertex[K, VV] = 
NV): Graph[K, NV, EV] = {

[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700920#comment-14700920
 ] 

ASF GitHub Bot commented on FLINK-1962:
---

Github user PieterJanVanAeken commented on the pull request:

https://github.com/apache/flink/pull/1004#issuecomment-132118620
  
I fixed the unneccessary comment and did a quick rebase to keep up with 
master branch.


 Add Gelly Scala API
 ---

 Key: FLINK-1962
 URL: https://issues.apache.org/jira/browse/FLINK-1962
 Project: Flink
  Issue Type: Task
  Components: Gelly, Scala API
Affects Versions: 0.9
Reporter: Vasia Kalavri
Assignee: PJ Van Aeken





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...

2015-08-18 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/977#issuecomment-132118720
  
Looks good and CI passes.

Will merge this!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2477) Add test for SocketClientSink

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700922#comment-14700922
 ] 

ASF GitHub Bot commented on FLINK-2477:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/977#issuecomment-132118720
  
Looks good and CI passes.

Will merge this!


 Add test for SocketClientSink
 -

 Key: FLINK-2477
 URL: https://issues.apache.org/jira/browse/FLINK-2477
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.10
 Environment: win7 32bit;linux
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 Add some tests for SocketClientSink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2540) LocalBufferPool.requestBuffer gets into infinite loop

2015-08-18 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-2540:
---
Fix Version/s: 0.9.1
   0.10

 LocalBufferPool.requestBuffer gets into infinite loop
 -

 Key: FLINK-2540
 URL: https://issues.apache.org/jira/browse/FLINK-2540
 Project: Flink
  Issue Type: Bug
  Components: Core
Reporter: Gabor Gevay
 Fix For: 0.10, 0.9.1


 I'm trying to run a complicated computation that looks like this: [1].
 One of the DataSource-Filter-Map chains finishes fine, but the other one 
 freezes. Debugging shows that it is spinning in the while loop in 
 LocalBufferPool.requestBuffer.
 askToRecycle is false. Both numberOfRequestedMemorySegments and 
 currentPoolSize is 128, so it never goes into that if either.
 This is a stack trace: [2]
 And here is the code, if you would like to run it: [3]. Unfortunately, I 
 can't make it more minimal, becuase if I remove some operators, the problem 
 disappears. The class to start is malom.Solver. (On first run, it calculates 
 some lookuptables for a few minutes, and puts them into /tmp/movegen)
 [1] http://compalg.inf.elte.hu/~ggevay/flink/plan.txt
 [2] http://compalg.inf.elte.hu/~ggevay/flink/stacktrace.txt
 [3] https://github.com/ggevay/flink/tree/deadlock-malom



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2542) It should be documented that it is required from a join key to override hashCode(), when it is not a POJO

2015-08-18 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2542:
--

 Summary: It should be documented that it is required from a join 
key to override hashCode(), when it is not a POJO
 Key: FLINK-2542
 URL: https://issues.apache.org/jira/browse/FLINK-2542
 Project: Flink
  Issue Type: Bug
  Components: Gelly, Java API
Reporter: Gabor Gevay
Priority: Minor
 Fix For: 0.10, 0.9.1


If the join key is not a POJO, and does not override hashCode, then the join 
silently fails (produces empty output). I don't see this documented anywhere.

The Gelly documentation should also have this info separately, because it does 
joins internally on the vertex IDs, but the user might not know this, or might 
not look at the join documentation when using Gelly.

Here is an example code:

{noformat}
public static class ID implements ComparableID {
public long foo;

//no default ctor -- not a POJO

public ID(long foo) {
this.foo = foo;
}

@Override
public int compareTo(ID o) {
return ((Long)foo).compareTo(o.foo);
}

@Override
public boolean equals(Object o0) {
if(o0 instanceof ID) {
ID o = (ID)o0;
return foo == o.foo;
} else {
return false;
}
}

@Override
public int hashCode() {
return 42;
}
}


public static void main(String[] args) throws Exception {
ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();

DataSetTuple2ID, Long inDegrees = env.fromElements(Tuple2.of(new 
ID(123l), 4l));
DataSetTuple2ID, Long outDegrees = env.fromElements(Tuple2.of(new 
ID(123l), 5l));

DataSetTuple3ID, Long, Long degrees = inDegrees.join(outDegrees, 
JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST).where(0).equalTo(0)
.with(new FlatJoinFunctionTuple2ID, Long, Tuple2ID, 
Long, Tuple3ID, Long, Long() {
@Override
public void join(Tuple2ID, Long first, 
Tuple2ID, Long second, CollectorTuple3ID, Long, Long out) {
out.collect(new Tuple3ID, Long, 
Long(first.f0, first.f1, second.f1));
}

}).withForwardedFieldsFirst(f0;f1).withForwardedFieldsSecond(f1);

System.out.println(degrees count:  + degrees.count());
}
{noformat}


This prints 1, but if I comment out the hashCode, it prints 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2540) LocalBufferPool.requestBuffer gets into infinite loop

2015-08-18 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701179#comment-14701179
 ] 

Stephan Ewen commented on FLINK-2540:
-

I put Ufuk as responsible for that (hop that's okay).

He knows that code best...

 LocalBufferPool.requestBuffer gets into infinite loop
 -

 Key: FLINK-2540
 URL: https://issues.apache.org/jira/browse/FLINK-2540
 Project: Flink
  Issue Type: Bug
  Components: Core
Reporter: Gabor Gevay
Assignee: Ufuk Celebi
Priority: Blocker
 Fix For: 0.10, 0.9.1


 I'm trying to run a complicated computation that looks like this: [1].
 One of the DataSource-Filter-Map chains finishes fine, but the other one 
 freezes. Debugging shows that it is spinning in the while loop in 
 LocalBufferPool.requestBuffer.
 askToRecycle is false. Both numberOfRequestedMemorySegments and 
 currentPoolSize is 128, so it never goes into that if either.
 This is a stack trace: [2]
 And here is the code, if you would like to run it: [3]. Unfortunately, I 
 can't make it more minimal, becuase if I remove some operators, the problem 
 disappears. The class to start is malom.Solver. (On first run, it calculates 
 some lookuptables for a few minutes, and puts them into /tmp/movegen)
 [1] http://compalg.inf.elte.hu/~ggevay/flink/plan.txt
 [2] http://compalg.inf.elte.hu/~ggevay/flink/stacktrace.txt
 [3] https://github.com/ggevay/flink/tree/deadlock-malom



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2540) LocalBufferPool.requestBuffer gets into infinite loop

2015-08-18 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-2540:

Priority: Blocker  (was: Major)

 LocalBufferPool.requestBuffer gets into infinite loop
 -

 Key: FLINK-2540
 URL: https://issues.apache.org/jira/browse/FLINK-2540
 Project: Flink
  Issue Type: Bug
  Components: Core
Reporter: Gabor Gevay
Priority: Blocker
 Fix For: 0.10, 0.9.1


 I'm trying to run a complicated computation that looks like this: [1].
 One of the DataSource-Filter-Map chains finishes fine, but the other one 
 freezes. Debugging shows that it is spinning in the while loop in 
 LocalBufferPool.requestBuffer.
 askToRecycle is false. Both numberOfRequestedMemorySegments and 
 currentPoolSize is 128, so it never goes into that if either.
 This is a stack trace: [2]
 And here is the code, if you would like to run it: [3]. Unfortunately, I 
 can't make it more minimal, becuase if I remove some operators, the problem 
 disappears. The class to start is malom.Solver. (On first run, it calculates 
 some lookuptables for a few minutes, and puts them into /tmp/movegen)
 [1] http://compalg.inf.elte.hu/~ggevay/flink/plan.txt
 [2] http://compalg.inf.elte.hu/~ggevay/flink/stacktrace.txt
 [3] https://github.com/ggevay/flink/tree/deadlock-malom



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2540) LocalBufferPool.requestBuffer gets into infinite loop

2015-08-18 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-2540:

Assignee: Ufuk Celebi

 LocalBufferPool.requestBuffer gets into infinite loop
 -

 Key: FLINK-2540
 URL: https://issues.apache.org/jira/browse/FLINK-2540
 Project: Flink
  Issue Type: Bug
  Components: Core
Reporter: Gabor Gevay
Assignee: Ufuk Celebi
Priority: Blocker
 Fix For: 0.10, 0.9.1


 I'm trying to run a complicated computation that looks like this: [1].
 One of the DataSource-Filter-Map chains finishes fine, but the other one 
 freezes. Debugging shows that it is spinning in the while loop in 
 LocalBufferPool.requestBuffer.
 askToRecycle is false. Both numberOfRequestedMemorySegments and 
 currentPoolSize is 128, so it never goes into that if either.
 This is a stack trace: [2]
 And here is the code, if you would like to run it: [3]. Unfortunately, I 
 can't make it more minimal, becuase if I remove some operators, the problem 
 disappears. The class to start is malom.Solver. (On first run, it calculates 
 some lookuptables for a few minutes, and puts them into /tmp/movegen)
 [1] http://compalg.inf.elte.hu/~ggevay/flink/plan.txt
 [2] http://compalg.inf.elte.hu/~ggevay/flink/stacktrace.txt
 [3] https://github.com/ggevay/flink/tree/deadlock-malom



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2508) Confusing sharing of StreamExecutionEnvironment

2015-08-18 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701196#comment-14701196
 ] 

Márton Balassi commented on FLINK-2508:
---

Thanks for both resports, Aljoscha. I am looking into how the batch side works 
to get the same behavior for streaming.

 Confusing sharing of StreamExecutionEnvironment
 ---

 Key: FLINK-2508
 URL: https://issues.apache.org/jira/browse/FLINK-2508
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Márton Balassi
 Fix For: 0.10


 In the {{StreamExecutionEnvironment}}, the environment is once created and 
 then shared with a static variable to all successive calls to 
 {{getExecutionEnvironment()}}. But it can be overridden by calls to 
 {{createLocalEnvironment()}} and {{createRemoteEnvironment()}}.
 This seems a bit un-intuitive, and probably creates confusion when 
 dispatching multiple streaming jobs from within the same JVM.
 Why is it even necessary to cache the current execution environment?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132187018
  
Olrite. I'll be back in a few. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2488) Expose attemptNumber in RuntimeContext

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701218#comment-14701218
 ] 

ASF GitHub Bot commented on FLINK-2488:
---

Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1026#discussion_r37294536
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/TaskRuntimeInfo.java ---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.common;
+
+/**
+ * Encapsulation of runtime information for a Task, including Task Name, 
sub-task index, etc.
+ *
+ */
+public class TaskRuntimeInfo implements java.io.Serializable {
+
+   /** Task Name */
+   private final String taskName;
+
+   /** Index of this task in the parallel task group. Goes from 0 to 
{@link #numParallelTasks} - 1 */
+   private final int subTaskIndex;
+
+   /** Number of parallel running instances of this task */
+   private final int numParallelTasks;
+
+   /** Attempt number of this task */
+   private final int attemptNumber;
+
+   /** Task Name along with information on index of task in group */
+   private transient final String taskNameWithSubTaskIndex;
--- End diff --

If this is transient, it is null after deserialization. Either make it 
non-transient, or re-compute it after deserialization.


 Expose attemptNumber in RuntimeContext
 --

 Key: FLINK-2488
 URL: https://issues.apache.org/jira/browse/FLINK-2488
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, TaskManager
Affects Versions: 0.10
Reporter: Robert Metzger
Assignee: Sachin Goel
Priority: Minor

 It would be nice to expose the attemptNumber of a task in the 
 {{RuntimeContext}}. 
 This would allow user code to behave differently in restart scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2488][FLINK-2496] Expose Task Manager c...

2015-08-18 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1026#discussion_r37294536
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/TaskRuntimeInfo.java ---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.common;
+
+/**
+ * Encapsulation of runtime information for a Task, including Task Name, 
sub-task index, etc.
+ *
+ */
+public class TaskRuntimeInfo implements java.io.Serializable {
+
+   /** Task Name */
+   private final String taskName;
+
+   /** Index of this task in the parallel task group. Goes from 0 to 
{@link #numParallelTasks} - 1 */
+   private final int subTaskIndex;
+
+   /** Number of parallel running instances of this task */
+   private final int numParallelTasks;
+
+   /** Attempt number of this task */
+   private final int attemptNumber;
+
+   /** Task Name along with information on index of task in group */
+   private transient final String taskNameWithSubTaskIndex;
--- End diff --

If this is transient, it is null after deserialization. Either make it 
non-transient, or re-compute it after deserialization.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132183402
  
Is there the valid reason to have multiple names ('discrete histogram', 
'categorical histogram')? I think that to avoid confusion, we need to unify the 
names.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2543) State handling does not support deserializing classes through the UserCodeClassloader

2015-08-18 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-2543:
-

 Summary: State handling does not support deserializing classes 
through the UserCodeClassloader
 Key: FLINK-2543
 URL: https://issues.apache.org/jira/browse/FLINK-2543
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9, 0.10
Reporter: Robert Metzger
Priority: Blocker


The current implementation of the state checkpointing does not support custom 
classes, because the UserCodeClassLoader is not used to deserialize the state.

{code}
Error: java.lang.RuntimeException: Failed to deserialize state handle and setup 
initial operator state.
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:544)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: 
com.ottogroup.bi.searchlab.searchsessionizer.OperatorState
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:626)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at 
org.apache.flink.runtime.state.ByteStreamStateHandle.getState(ByteStreamStateHandle.java:63)
at 
org.apache.flink.runtime.state.ByteStreamStateHandle.getState(ByteStreamStateHandle.java:33)
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.restoreInitialState(AbstractUdfStreamOperator.java:83)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.setInitialState(StreamTask.java:276)
at 
org.apache.flink.runtime.state.StateUtils.setOperatorState(StateUtils.java:51)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:541)
{code}

The issue has been reported by a user: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Custom-Class-for-state-checkpointing-td2415.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-2543) State handling does not support deserializing classes through the UserCodeClassloader

2015-08-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-2543:
-

Assignee: Robert Metzger

 State handling does not support deserializing classes through the 
 UserCodeClassloader
 -

 Key: FLINK-2543
 URL: https://issues.apache.org/jira/browse/FLINK-2543
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9, 0.10
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Blocker

 The current implementation of the state checkpointing does not support custom 
 classes, because the UserCodeClassLoader is not used to deserialize the state.
 {code}
 Error: java.lang.RuntimeException: Failed to deserialize state handle and 
 setup initial operator state.
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:544)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.ClassNotFoundException: 
 com.ottogroup.bi.searchlab.searchsessionizer.OperatorState
 at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:348)
 at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:626)
 at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
 at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
 at 
 org.apache.flink.runtime.state.ByteStreamStateHandle.getState(ByteStreamStateHandle.java:63)
 at 
 org.apache.flink.runtime.state.ByteStreamStateHandle.getState(ByteStreamStateHandle.java:33)
 at 
 org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.restoreInitialState(AbstractUdfStreamOperator.java:83)
 at 
 org.apache.flink.streaming.runtime.tasks.StreamTask.setInitialState(StreamTask.java:276)
 at 
 org.apache.flink.runtime.state.StateUtils.setOperatorState(StateUtils.java:51)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:541)
 {code}
 The issue has been reported by a user: 
 http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Custom-Class-for-state-checkpointing-td2415.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2508) Confusing sharing of StreamExecutionEnvironment

2015-08-18 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701175#comment-14701175
 ] 

Aljoscha Krettek commented on FLINK-2508:
-

Another problem I discovered is that TestStreamEnvironment is polluting other 
tests. For example, in TranslationTest the test uses 
{{StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();}}. When I execute this 
standalone (or in an IDE) the call will return a LocalStreamEnvironment. When 
other tests are run before this they set some TestStreamEnvironment as context 
environment and then this call will pick up that TestStreamEnvironment. When 
this happens the execute call will also execute the leftover (lingering) 
operators from other tests. 

 Confusing sharing of StreamExecutionEnvironment
 ---

 Key: FLINK-2508
 URL: https://issues.apache.org/jira/browse/FLINK-2508
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Márton Balassi
 Fix For: 0.10


 In the {{StreamExecutionEnvironment}}, the environment is once created and 
 then shared with a static variable to all successive calls to 
 {{getExecutionEnvironment()}}. But it can be overridden by calls to 
 {{createLocalEnvironment()}} and {{createRemoteEnvironment()}}.
 This seems a bit un-intuitive, and probably creates confusion when 
 dispatching multiple streaming jobs from within the same JVM.
 Why is it even necessary to cache the current execution environment?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2521] [tests] Adds automatic test name ...

2015-08-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1015


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132188989
  
@chiwanpark , modified the names.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2488][FLINK-2496] Expose Task Manager c...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132198739
  
Rebased to the latest master. Can be merged now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2488) Expose attemptNumber in RuntimeContext

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701217#comment-14701217
 ] 

ASF GitHub Bot commented on FLINK-2488:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132198739
  
Rebased to the latest master. Can be merged now.


 Expose attemptNumber in RuntimeContext
 --

 Key: FLINK-2488
 URL: https://issues.apache.org/jira/browse/FLINK-2488
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, TaskManager
Affects Versions: 0.10
Reporter: Robert Metzger
Assignee: Sachin Goel
Priority: Minor

 It would be nice to expose the attemptNumber of a task in the 
 {{RuntimeContext}}. 
 This would allow user code to behave differently in restart scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2541) TypeComparator creation fails for T2T1byte[], byte[]

2015-08-18 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-2541:
---

 Summary: TypeComparator creation fails for T2T1byte[], byte[]
 Key: FLINK-2541
 URL: https://issues.apache.org/jira/browse/FLINK-2541
 Project: Flink
  Issue Type: Bug
Reporter: Chesnay Schepler


When running the following job as a JavaProgramTest:
{code}
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
DataSetTuple2Tuple1byte[], byte[] data = env.fromElements(
new Tuple2Tuple1byte[], byte[](
new Tuple1byte[](new byte[]{1, 2}), 
new byte[]{1, 2, 3}),
new Tuple2Tuple1byte[], byte[](
new Tuple1byte[](new byte[]{1, 2}), 
new byte[]{1, 2, 3}));

data.groupBy(f0.f0)
.reduceGroup(new DummyReduceTuple2Tuple1byte[], byte[]())
.print();
{code}
with DummyReduce defined as
{code}
public static class DummyReduceIN implements GroupReduceFunctionIN, IN {
@Override
public void reduce(IterableIN values, CollectorIN out) throws Exception {
for (IN value : values) {
out.collect(value);
}}}
{code}

i encountered the following exception:

Tuple comparator creation has a bug
java.lang.IllegalArgumentException: Tuple comparator creation has a bug
at 
org.apache.flink.api.java.typeutils.TupleTypeInfo.getNewComparator(TupleTypeInfo.java:131)
at 
org.apache.flink.api.common.typeutils.CompositeType.createComparator(CompositeType.java:133)
at 
org.apache.flink.api.common.typeutils.CompositeType.createComparator(CompositeType.java:122)
at 
org.apache.flink.api.common.operators.base.GroupReduceOperatorBase.getTypeComparator(GroupReduceOperatorBase.java:155)
at 
org.apache.flink.api.common.operators.base.GroupReduceOperatorBase.executeOnCollections(GroupReduceOperatorBase.java:184)
at 
org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:236)
at 
org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:143)
at 
org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:215)
at 
org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:143)
at 
org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:125)
at 
org.apache.flink.api.common.operators.CollectionExecutor.executeDataSink(CollectionExecutor.java:176)
at 
org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:152)
at 
org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:125)
at 
org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:109)
at 
org.apache.flink.api.java.CollectionEnvironment.execute(CollectionEnvironment.java:33)
at 
org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:35)
at 
org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:30)
at org.apache.flink.api.java.DataSet.collect(DataSet.java:408)
at org.apache.flink.api.java.DataSet.print(DataSet.java:1349)
at 
org.apache.flink.languagebinding.api.java.python.AbstractPythonTest.testProgram(AbstractPythonTest.java:42)
at 
org.apache.flink.test.util.JavaProgramTestBase.testJobCollectionExecution(JavaProgramTestBase.java:226)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at 

[jira] [Updated] (FLINK-2541) TypeComparator creation fails for T2T1byte[], byte[]

2015-08-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-2541:
--
Component/s: Java API

 TypeComparator creation fails for T2T1byte[], byte[]
 

 Key: FLINK-2541
 URL: https://issues.apache.org/jira/browse/FLINK-2541
 Project: Flink
  Issue Type: Bug
  Components: Java API
Reporter: Chesnay Schepler

 When running the following job as a JavaProgramTest:
 {code}
 ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
 DataSetTuple2Tuple1byte[], byte[] data = env.fromElements(
   new Tuple2Tuple1byte[], byte[](
   new Tuple1byte[](new byte[]{1, 2}), 
   new byte[]{1, 2, 3}),
   new Tuple2Tuple1byte[], byte[](
   new Tuple1byte[](new byte[]{1, 2}), 
   new byte[]{1, 2, 3}));
 data.groupBy(f0.f0)
 .reduceGroup(new DummyReduceTuple2Tuple1byte[], byte[]())
 .print();
 {code}
 with DummyReduce defined as
 {code}
 public static class DummyReduceIN implements GroupReduceFunctionIN, IN {
 @Override
 public void reduce(IterableIN values, CollectorIN out) throws Exception {
   for (IN value : values) {
   out.collect(value);
   }}}
 {code}
 i encountered the following exception:
 Tuple comparator creation has a bug
 java.lang.IllegalArgumentException: Tuple comparator creation has a bug
   at 
 org.apache.flink.api.java.typeutils.TupleTypeInfo.getNewComparator(TupleTypeInfo.java:131)
   at 
 org.apache.flink.api.common.typeutils.CompositeType.createComparator(CompositeType.java:133)
   at 
 org.apache.flink.api.common.typeutils.CompositeType.createComparator(CompositeType.java:122)
   at 
 org.apache.flink.api.common.operators.base.GroupReduceOperatorBase.getTypeComparator(GroupReduceOperatorBase.java:155)
   at 
 org.apache.flink.api.common.operators.base.GroupReduceOperatorBase.executeOnCollections(GroupReduceOperatorBase.java:184)
   at 
 org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:236)
   at 
 org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:143)
   at 
 org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:215)
   at 
 org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:143)
   at 
 org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:125)
   at 
 org.apache.flink.api.common.operators.CollectionExecutor.executeDataSink(CollectionExecutor.java:176)
   at 
 org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:152)
   at 
 org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:125)
   at 
 org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:109)
   at 
 org.apache.flink.api.java.CollectionEnvironment.execute(CollectionEnvironment.java:33)
   at 
 org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:35)
   at 
 org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:30)
   at org.apache.flink.api.java.DataSet.collect(DataSet.java:408)
   at org.apache.flink.api.java.DataSet.print(DataSet.java:1349)
   at 
 org.apache.flink.languagebinding.api.java.python.AbstractPythonTest.testProgram(AbstractPythonTest.java:42)
   at 
 org.apache.flink.test.util.JavaProgramTestBase.testJobCollectionExecution(JavaProgramTestBase.java:226)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
 

[jira] [Commented] (FLINK-2521) Add automatic test name logging for tests

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1470#comment-1470
 ] 

ASF GitHub Bot commented on FLINK-2521:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1015


 Add automatic test name logging for tests
 -

 Key: FLINK-2521
 URL: https://issues.apache.org/jira/browse/FLINK-2521
 Project: Flink
  Issue Type: Improvement
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Priority: Minor

 When running tests on travis the Flink components log to a file. This is 
 helpful in case of a failed test to retrieve the error. However, the log does 
 not contain the test name and the reason for the failure. Therefore it is 
 difficult to find the log output which corresponds to the failed test.
 It would be nice to automatically add the test case information to the log. 
 This would ease the debugging process big time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-2521) Add automatic test name logging for tests

2015-08-18 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-2521.

Resolution: Fixed

Added via 2f0412f163f4d37605188c8cc763111e0b51f0dc

 Add automatic test name logging for tests
 -

 Key: FLINK-2521
 URL: https://issues.apache.org/jira/browse/FLINK-2521
 Project: Flink
  Issue Type: Improvement
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Priority: Minor

 When running tests on travis the Flink components log to a file. This is 
 helpful in case of a failed test to retrieve the error. However, the log does 
 not contain the test name and the reason for the failure. Therefore it is 
 difficult to find the log output which corresponds to the failed test.
 It would be nice to automatically add the test case information to the log. 
 This would ease the debugging process big time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132186504
  
I'm also inclined to use Discrete.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132193111
  
They're not coupled at all. But both are related to statistics over data 
sets. This is why I combined them both in one. If you're wondering, there is a 
JIRA for the column wise statistics as well. [FLINK-2379].
Unless it is absolutely necessary, I'd like to keep them both in one. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2529][runtime]remove some unused code

2015-08-18 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1022#issuecomment-132196596
  
I'll merge half of this pull request...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1994) Add different gain calculation schemes to SGD

2015-08-18 Thread Trevor Grant (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701200#comment-14701200
 ] 

Trevor Grant commented on FLINK-1994:
-

I more or less have this working with a few new optimizers based on sklearn- 
specifically the constant, optimal, and inverse scaling. 
Reading this paper to find others, as well as including bulleted ones in issue.

 Add different gain calculation schemes to SGD
 -

 Key: FLINK-1994
 URL: https://issues.apache.org/jira/browse/FLINK-1994
 Project: Flink
  Issue Type: Improvement
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Trevor Grant
Priority: Minor
  Labels: ML, Starter

 The current SGD implementation uses as gain for the weight updates the 
 formula {{stepsize/sqrt(iterationNumber)}}. It would be good to make the gain 
 calculation configurable and to provide different strategies for that. For 
 example:
 * stepsize/(1 + iterationNumber)
 * stepsize*(1 + regularization * stepsize * iterationNumber)^(-3/4)
 See also how to properly select the gains [1].
 Resources:
 [1] http://arxiv.org/pdf/1107.2490.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2529) fix on some unused code for flink-runtime

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701198#comment-14701198
 ] 

ASF GitHub Bot commented on FLINK-2529:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1022#issuecomment-132196596
  
I'll merge half of this pull request...


 fix on some unused code for flink-runtime
 -

 Key: FLINK-2529
 URL: https://issues.apache.org/jira/browse/FLINK-2529
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 In file BlobServer.java, I found the Thread.currentThread() will never return 
 null in my learned knowledge.
 So I think shutdownHook != null“ is not necessary in code 'if (shutdownHook 
 != null  shutdownHook != Thread.currentThread())';
 And I am not complete sure.
 Maybe I am wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2488][FLINK-2496] Expose Task Manager c...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132199820
  
Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2488) Expose attemptNumber in RuntimeContext

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701227#comment-14701227
 ] 

ASF GitHub Bot commented on FLINK-2488:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132199820
  
Fixed.


 Expose attemptNumber in RuntimeContext
 --

 Key: FLINK-2488
 URL: https://issues.apache.org/jira/browse/FLINK-2488
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, TaskManager
Affects Versions: 0.10
Reporter: Robert Metzger
Assignee: Sachin Goel
Priority: Minor

 It would be nice to expose the attemptNumber of a task in the 
 {{RuntimeContext}}. 
 This would allow user code to behave differently in restart scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1994) Add different gain calculation schemes to SGD

2015-08-18 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701225#comment-14701225
 ] 

Till Rohrmann commented on FLINK-1994:
--

Sounds great [~rawkintrevo]. When it's ready, then you should open a PR against 
Flink's master branch on github. 

 Add different gain calculation schemes to SGD
 -

 Key: FLINK-1994
 URL: https://issues.apache.org/jira/browse/FLINK-1994
 Project: Flink
  Issue Type: Improvement
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Trevor Grant
Priority: Minor
  Labels: ML, Starter

 The current SGD implementation uses as gain for the weight updates the 
 formula {{stepsize/sqrt(iterationNumber)}}. It would be good to make the gain 
 calculation configurable and to provide different strategies for that. For 
 example:
 * stepsize/(1 + iterationNumber)
 * stepsize*(1 + regularization * stepsize * iterationNumber)^(-3/4)
 See also how to properly select the gains [1].
 Resources:
 [1] http://arxiv.org/pdf/1107.2490.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2545) NegativeArraySizeException while creating hash table bloom filters

2015-08-18 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-2545:
-

 Summary: NegativeArraySizeException while creating hash table 
bloom filters
 Key: FLINK-2545
 URL: https://issues.apache.org/jira/browse/FLINK-2545
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: master
Reporter: Greg Hogan


The following exception occurred a second time when I immediately re-ran my 
application, though after recompiling and restarting Flink the subsequent 
execution ran without error.

java.lang.Exception: The data preparation for task '...' , caused an error: null
at 
org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:465)
at 
org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:354)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NegativeArraySizeException
at 
org.apache.flink.runtime.operators.hash.MutableHashTable.buildBloomFilterForBucket(MutableHashTable.java:1160)
at 
org.apache.flink.runtime.operators.hash.MutableHashTable.buildBloomFilterForBucketsInPartition(MutableHashTable.java:1143)
at 
org.apache.flink.runtime.operators.hash.MutableHashTable.spillPartition(MutableHashTable.java:1117)
at 
org.apache.flink.runtime.operators.hash.MutableHashTable.insertBucketEntry(MutableHashTable.java:946)
at 
org.apache.flink.runtime.operators.hash.MutableHashTable.insertIntoTable(MutableHashTable.java:868)
at 
org.apache.flink.runtime.operators.hash.MutableHashTable.buildInitialTable(MutableHashTable.java:692)
at 
org.apache.flink.runtime.operators.hash.MutableHashTable.open(MutableHashTable.java:455)
at 
org.apache.flink.runtime.operators.hash.ReusingBuildSecondHashMatchIterator.open(ReusingBuildSecondHashMatchIterator.java:93)
at 
org.apache.flink.runtime.operators.JoinDriver.prepare(JoinDriver.java:195)
at 
org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:459)
... 3 more




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2545) NegativeArraySizeException while creating hash table bloom filters

2015-08-18 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701802#comment-14701802
 ] 

Stephan Ewen commented on FLINK-2545:
-

Thanks for reporting this!

As a quick fix, you can disable bloom filters by adding 
{{taskmanager.runtime.hashjoin-bloom-filters: false}} to the Flink config.

See here for a reference: 
https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#runtime-algorithms

Th bloom filters are a relatively new addition in 0.10.

 NegativeArraySizeException while creating hash table bloom filters
 --

 Key: FLINK-2545
 URL: https://issues.apache.org/jira/browse/FLINK-2545
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: master
Reporter: Greg Hogan

 The following exception occurred a second time when I immediately re-ran my 
 application, though after recompiling and restarting Flink the subsequent 
 execution ran without error.
 java.lang.Exception: The data preparation for task '...' , caused an error: 
 null
   at 
 org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:465)
   at 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:354)
   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.NegativeArraySizeException
   at 
 org.apache.flink.runtime.operators.hash.MutableHashTable.buildBloomFilterForBucket(MutableHashTable.java:1160)
   at 
 org.apache.flink.runtime.operators.hash.MutableHashTable.buildBloomFilterForBucketsInPartition(MutableHashTable.java:1143)
   at 
 org.apache.flink.runtime.operators.hash.MutableHashTable.spillPartition(MutableHashTable.java:1117)
   at 
 org.apache.flink.runtime.operators.hash.MutableHashTable.insertBucketEntry(MutableHashTable.java:946)
   at 
 org.apache.flink.runtime.operators.hash.MutableHashTable.insertIntoTable(MutableHashTable.java:868)
   at 
 org.apache.flink.runtime.operators.hash.MutableHashTable.buildInitialTable(MutableHashTable.java:692)
   at 
 org.apache.flink.runtime.operators.hash.MutableHashTable.open(MutableHashTable.java:455)
   at 
 org.apache.flink.runtime.operators.hash.ReusingBuildSecondHashMatchIterator.open(ReusingBuildSecondHashMatchIterator.java:93)
   at 
 org.apache.flink.runtime.operators.JoinDriver.prepare(JoinDriver.java:195)
   at 
 org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:459)
   ... 3 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2488) Expose attemptNumber in RuntimeContext

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701814#comment-14701814
 ] 

ASF GitHub Bot commented on FLINK-2488:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132322001
  
Sorry to be picky again, but it would be good to harmonize the code style a 
bit - make it more consistent with the remaining code.

What struck me in particular the omission of space before the curly braces 
at the start of methods and other code blocks. All other parts of the code have 
that.

There are ongoing discussions about making the code style stricter (also 
with respect to white spaces), and this is much easier if new code follows this 
standard already. Otherwise, there'd going to be a lot of style adjustments 
necessary when introducing the check.


 Expose attemptNumber in RuntimeContext
 --

 Key: FLINK-2488
 URL: https://issues.apache.org/jira/browse/FLINK-2488
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, TaskManager
Affects Versions: 0.10
Reporter: Robert Metzger
Assignee: Sachin Goel
Priority: Minor

 It would be nice to expose the attemptNumber of a task in the 
 {{RuntimeContext}}. 
 This would allow user code to behave differently in restart scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2488][FLINK-2496] Expose Task Manager c...

2015-08-18 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132322001
  
Sorry to be picky again, but it would be good to harmonize the code style a 
bit - make it more consistent with the remaining code.

What struck me in particular the omission of space before the curly braces 
at the start of methods and other code blocks. All other parts of the code have 
that.

There are ongoing discussions about making the code style stricter (also 
with respect to white spaces), and this is much easier if new code follows this 
standard already. Otherwise, there'd going to be a lot of style adjustments 
necessary when introducing the check.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Path refactor

2015-08-18 Thread gallenvara
GitHub user gallenvara opened a pull request:

https://github.com/apache/flink/pull/1034

Path refactor



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gallenvara/flink path_refactor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1034.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1034


commit 526a50a4ba38ae84574c31a6e67b39ac14649d44
Author: gallenvara gallenv...@126.com
Date:   2015-08-18T10:31:40Z

[Flink-2077] Clean and refactor the Path class.

commit 4fb1b03f89d1a3a2793e957ff0132ba70bffb3a6
Author: gallenvara gallenv...@126.com
Date:   2015-08-19T01:41:55Z

[Flink-2077] Clean and refactor the Path class.

commit 380ed45473f51ff2835bf8fcaa00e45c7b0b
Author: gallenvara gallenv...@126.com
Date:   2015-08-19T02:46:46Z

[Flink-2077] Clean and refactor the Path class.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2077] [core] Rework Path class and add ...

2015-08-18 Thread gallenvara
GitHub user gallenvara opened a pull request:

https://github.com/apache/flink/pull/1035

[FLINK-2077] [core] Rework Path class and add extend support for Windows 
paths

The class org.apache.flink.core.fs.Path handles paths for Flink's 
FileInputFormat and FileOutputFormat. Over time, this class has become quite 
hard to read and modify.
This PR does some work on cleaning and refactoring.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gallenvara/flink path_refactor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1035.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1035


commit 526a50a4ba38ae84574c31a6e67b39ac14649d44
Author: gallenvara gallenv...@126.com
Date:   2015-08-18T10:31:40Z

[Flink-2077] Clean and refactor the Path class.

commit 4fb1b03f89d1a3a2793e957ff0132ba70bffb3a6
Author: gallenvara gallenv...@126.com
Date:   2015-08-19T01:41:55Z

[Flink-2077] Clean and refactor the Path class.

commit 380ed45473f51ff2835bf8fcaa00e45c7b0b
Author: gallenvara gallenv...@126.com
Date:   2015-08-19T02:46:46Z

[Flink-2077] Clean and refactor the Path class.

commit b7c770df6b2b255a30b5af5589544bf1597f32f5
Author: gallenvara gallenv...@126.com
Date:   2015-08-19T03:22:41Z

[FLINK-2077] [core] Rework Path class and add extend support for Windows 
paths




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Issue Comment Deleted] (FLINK-2366) HA Without ZooKeeper

2015-08-18 Thread Suminda Dharmasena (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suminda Dharmasena updated FLINK-2366:
--
Comment: was deleted

(was: OK.

As long as it is *easily pluggable* it is OK. Also another established project: 
https://github.com/belaban/JGroups)

 HA Without ZooKeeper
 

 Key: FLINK-2366
 URL: https://issues.apache.org/jira/browse/FLINK-2366
 Project: Flink
  Issue Type: Improvement
Reporter: Suminda Dharmasena
Priority: Minor

 Please provide a way to do HA without having to use ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2366) HA Without ZooKeeper

2015-08-18 Thread Suminda Dharmasena (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702379#comment-14702379
 ] 

Suminda Dharmasena commented on FLINK-2366:
---

OK.

As long as it is *easily pluggable* it is OK. Also another established project: 
https://github.com/belaban/JGroups

 HA Without ZooKeeper
 

 Key: FLINK-2366
 URL: https://issues.apache.org/jira/browse/FLINK-2366
 Project: Flink
  Issue Type: Improvement
Reporter: Suminda Dharmasena
Priority: Minor

 Please provide a way to do HA without having to use ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2366) HA Without ZooKeeper

2015-08-18 Thread Suminda Dharmasena (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702380#comment-14702380
 ] 

Suminda Dharmasena commented on FLINK-2366:
---

OK.

As long as it is *easily pluggable* it is OK. Also another established project: 
https://github.com/belaban/JGroups

 HA Without ZooKeeper
 

 Key: FLINK-2366
 URL: https://issues.apache.org/jira/browse/FLINK-2366
 Project: Flink
  Issue Type: Improvement
Reporter: Suminda Dharmasena
Priority: Minor

 Please provide a way to do HA without having to use ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132439907
  
Including the collect step in the documentation seems weird, IMO. Since the 
return type is a `DataSet[Histogram]`, anyone familiar with flink will know to 
collect it first before operating on the element itself. 

Of course, one other option would be actually collect and return the 
histogram in the `create` functions, but we don't wanna do that in case it is 
used inside an iteration.
Should I include the collect step in the documentation then?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1901) Create sample operator for Dataset

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702304#comment-14702304
 ] 

ASF GitHub Bot commented on FLINK-1901:
---

Github user ChengXiangLi commented on a diff in the pull request:

https://github.com/apache/flink/pull/949#discussion_r37373497
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/operators/util/PoissonSampler.java
 ---
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.common.operators.util;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.math3.distribution.PoissonDistribution;
+
+import java.util.Iterator;
+
+/**
+ * A sampler implementation based on Poisson Distribution. While sampling 
elements with fraction and replacement,
+ * the selected number of each element follows a given poisson 
distribution, so we could use poisson
+ * distribution to generate random variables for sample.
+ *
+ * @param T The type of sample.
+ * @see a 
href=https://en.wikipedia.org/wiki/Poisson_distribution;https://en.wikipedia.org/wiki/Poisson_distribution/a
+ */
+public class PoissonSamplerT extends RandomSamplerT {
+   
+   private PoissonDistribution poissonDistribution;
+   private final double fraction;
+   
+   /**
+* Create a poisson sampler which would sample elements with 
replacement.
+*
+* @param fraction The expected count of each element.
+* @param seed Random number generator seed for internal 
PoissonDistribution.
+*/
+   public PoissonSampler(double fraction, long seed) {
+   Preconditions.checkArgument(fraction = 0, fraction should be 
positive.);
+   this.fraction = fraction;
+   if (this.fraction  0) {
+   this.poissonDistribution = new 
PoissonDistribution(fraction);
+   this.poissonDistribution.reseedRandomGenerator(seed);
+   }
+   }
+   
+   /**
+* Create a poisson sampler which would sample elements with 
replacement.
+*
+* @param fraction The expected count of each element.
+*/
+   public PoissonSampler(double fraction) {
+   Preconditions.checkArgument(fraction = 0, fraction should be 
non-negative.);
+   this.fraction = fraction;
+   if (this.fraction  0) {
+   this.poissonDistribution = new 
PoissonDistribution(fraction);
+   }
+   }
+   
+   /**
+* Sample the input elements, for each input element, generate its 
count with poisson distribution random variables generation.
+*
+* @param input Elements to be sampled.
+* @return The sampled result which is lazy computed upon input 
elements.
+*/
+   @Override
+   public IteratorT sample(final IteratorT input) {
+   if (fraction == 0) {
+   return EMPTY_ITERABLE;
+   }
+   
+   return new SampledIteratorT() {
+   T currentElement;
+   int currentCount = 0;
+   
+   @Override
+   public boolean hasNext() {
+   if (currentElement == null || currentCount == 
0) {
+   while (input.hasNext()) {
+   currentElement = input.next();
+   currentCount = 
poissonDistribution.sample();
+   if (currentCount  0) {
+   return true;
+   }
+   }
+   return false;
+   }
+   return true;
+   }
+   

[GitHub] flink pull request: [FLINK-1901] [core] Create sample operator for...

2015-08-18 Thread ChengXiangLi
Github user ChengXiangLi commented on a diff in the pull request:

https://github.com/apache/flink/pull/949#discussion_r37373497
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/operators/util/PoissonSampler.java
 ---
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.common.operators.util;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.math3.distribution.PoissonDistribution;
+
+import java.util.Iterator;
+
+/**
+ * A sampler implementation based on Poisson Distribution. While sampling 
elements with fraction and replacement,
+ * the selected number of each element follows a given poisson 
distribution, so we could use poisson
+ * distribution to generate random variables for sample.
+ *
+ * @param T The type of sample.
+ * @see a 
href=https://en.wikipedia.org/wiki/Poisson_distribution;https://en.wikipedia.org/wiki/Poisson_distribution/a
+ */
+public class PoissonSamplerT extends RandomSamplerT {
+   
+   private PoissonDistribution poissonDistribution;
+   private final double fraction;
+   
+   /**
+* Create a poisson sampler which would sample elements with 
replacement.
+*
+* @param fraction The expected count of each element.
+* @param seed Random number generator seed for internal 
PoissonDistribution.
+*/
+   public PoissonSampler(double fraction, long seed) {
+   Preconditions.checkArgument(fraction = 0, fraction should be 
positive.);
+   this.fraction = fraction;
+   if (this.fraction  0) {
+   this.poissonDistribution = new 
PoissonDistribution(fraction);
+   this.poissonDistribution.reseedRandomGenerator(seed);
+   }
+   }
+   
+   /**
+* Create a poisson sampler which would sample elements with 
replacement.
+*
+* @param fraction The expected count of each element.
+*/
+   public PoissonSampler(double fraction) {
+   Preconditions.checkArgument(fraction = 0, fraction should be 
non-negative.);
+   this.fraction = fraction;
+   if (this.fraction  0) {
+   this.poissonDistribution = new 
PoissonDistribution(fraction);
+   }
+   }
+   
+   /**
+* Sample the input elements, for each input element, generate its 
count with poisson distribution random variables generation.
+*
+* @param input Elements to be sampled.
+* @return The sampled result which is lazy computed upon input 
elements.
+*/
+   @Override
+   public IteratorT sample(final IteratorT input) {
+   if (fraction == 0) {
+   return EMPTY_ITERABLE;
+   }
+   
+   return new SampledIteratorT() {
+   T currentElement;
+   int currentCount = 0;
+   
+   @Override
+   public boolean hasNext() {
+   if (currentElement == null || currentCount == 
0) {
+   while (input.hasNext()) {
+   currentElement = input.next();
+   currentCount = 
poissonDistribution.sample();
+   if (currentCount  0) {
+   return true;
+   }
+   }
+   return false;
+   }
+   return true;
+   }
+   
+   @Override
+   public T next() {
+   T result = currentElement;
+   if (currentCount == 0) {
+   currentElement = 

[GitHub] flink pull request: Add Scala version of GraphAlgorithm

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/1033#discussion_r37376484
  
--- Diff: 
flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/GraphAlgorithm.scala
 ---
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph.scala
+
+/**
+ * @tparam K key type
+ * @tparam VV vertex value type
+ * @tparam EV edge value type
+ */
+abstract class GraphAlgorithm[K, VV, EV] {
+
+  def run(graph: Graph[K, VV, EV]): Graph[K,VV,EV]
--- End diff --

Please add space after comma :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Add Scala version of GraphAlgorithm

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/1033#discussion_r37376469
  
--- Diff: 
flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala
 ---
@@ -649,10 +650,13 @@ TypeInformation : ClassTag](jgraph: jg.Graph[K, VV, 
EV]) {
   jtuple.f1))
   }
 
-  def run(algorithm: GraphAlgorithm[K, VV, EV]) = {
+  def run(algorithm: JGraphAlgorithm[K, VV, EV]): Graph[K,VV,EV] = {
--- End diff --

Please add space after comma :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Add Scala version of GraphAlgorithm

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/1033#discussion_r37376477
  
--- Diff: 
flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala
 ---
@@ -649,10 +650,13 @@ TypeInformation : ClassTag](jgraph: jg.Graph[K, VV, 
EV]) {
   jtuple.f1))
   }
 
-  def run(algorithm: GraphAlgorithm[K, VV, EV]) = {
+  def run(algorithm: JGraphAlgorithm[K, VV, EV]): Graph[K,VV,EV] = {
 wrapGraph(jgraph.run(algorithm))
   }
 
+  def run(algorithm: GraphAlgorithm[K,VV,EV]): Graph[K,VV,EV] = {
--- End diff --

Please add space after comma.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132430686
  
The only problem to merge this is invalid example in documentation. We need 
calling `collect()` method and `apply(0)` to execute statistics methods such as 
`quantile`, `sum`, ..., etc..

Other things seem okay. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2077) Rework Path class and add extend support for Windows paths

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702375#comment-14702375
 ] 

ASF GitHub Bot commented on FLINK-2077:
---

GitHub user gallenvara opened a pull request:

https://github.com/apache/flink/pull/1035

[FLINK-2077] [core] Rework Path class and add extend support for Windows 
paths

The class org.apache.flink.core.fs.Path handles paths for Flink's 
FileInputFormat and FileOutputFormat. Over time, this class has become quite 
hard to read and modify.
This PR does some work on cleaning and refactoring.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gallenvara/flink path_refactor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1035.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1035


commit 526a50a4ba38ae84574c31a6e67b39ac14649d44
Author: gallenvara gallenv...@126.com
Date:   2015-08-18T10:31:40Z

[Flink-2077] Clean and refactor the Path class.

commit 4fb1b03f89d1a3a2793e957ff0132ba70bffb3a6
Author: gallenvara gallenv...@126.com
Date:   2015-08-19T01:41:55Z

[Flink-2077] Clean and refactor the Path class.

commit 380ed45473f51ff2835bf8fcaa00e45c7b0b
Author: gallenvara gallenv...@126.com
Date:   2015-08-19T02:46:46Z

[Flink-2077] Clean and refactor the Path class.

commit b7c770df6b2b255a30b5af5589544bf1597f32f5
Author: gallenvara gallenv...@126.com
Date:   2015-08-19T03:22:41Z

[FLINK-2077] [core] Rework Path class and add extend support for Windows 
paths




 Rework Path class and add extend support for Windows paths
 --

 Key: FLINK-2077
 URL: https://issues.apache.org/jira/browse/FLINK-2077
 Project: Flink
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: GaoLun
Priority: Minor
  Labels: starter

 The class {{org.apache.flink.core.fs.Path}} handles paths for Flink's 
 {{FileInputFormat}} and {{FileOutputFormat}}. Over time, this class has 
 become quite hard to read and modify. 
 It would benefit from some cleaning and refactoring. Along with the 
 refactoring, support for Windows paths like {{//host/dir1/dir2}} could be 
 added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2488) Expose attemptNumber in RuntimeContext

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702524#comment-14702524
 ] 

ASF GitHub Bot commented on FLINK-2488:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132453057
  
@StephanEwen , I have fixed the styling issues as much as I could find out. 
Thanks for your patience. :)


 Expose attemptNumber in RuntimeContext
 --

 Key: FLINK-2488
 URL: https://issues.apache.org/jira/browse/FLINK-2488
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, TaskManager
Affects Versions: 0.10
Reporter: Robert Metzger
Assignee: Sachin Goel
Priority: Minor

 It would be nice to expose the attemptNumber of a task in the 
 {{RuntimeContext}}. 
 This would allow user code to behave differently in restart scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2488][FLINK-2496] Expose Task Manager c...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132453057
  
@StephanEwen , I have fixed the styling issues as much as I could find out. 
Thanks for your patience. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [Flink-2030][ml]Data Set Statistics and Histog...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/861#issuecomment-132449023
  
Hi @chiwanpark , I have modified the documentation to include the collect 
step. 
I have also made a few modification to how the histogram is created by 
using a `MapPartition` function instead of `Map`. Also, for more consistency, 
changed the signature of `DataStats` creation to return `DataSet[DataStats]` 
instead of `DataStats`.
@thvasilo , I have integrated the statistics part with this. I hope that is 
alright. Please have a look at `MLUtils.scala` to see the usage. If it's not, 
I'd be happy to remove it.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2488][FLINK-2496] Expose Task Manager c...

2015-08-18 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132449408
  
Sure thing. I'll fix it. IntelliJ re-formats used to take care of such 
things, but I've stopped using it recently as it was messing up scala code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2488) Expose attemptNumber in RuntimeContext

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702475#comment-14702475
 ] 

ASF GitHub Bot commented on FLINK-2488:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1026#issuecomment-132449408
  
Sure thing. I'll fix it. IntelliJ re-formats used to take care of such 
things, but I've stopped using it recently as it was messing up scala code.


 Expose attemptNumber in RuntimeContext
 --

 Key: FLINK-2488
 URL: https://issues.apache.org/jira/browse/FLINK-2488
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, TaskManager
Affects Versions: 0.10
Reporter: Robert Metzger
Assignee: Sachin Goel
Priority: Minor

 It would be nice to expose the attemptNumber of a task in the 
 {{RuntimeContext}}. 
 This would allow user code to behave differently in restart scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-2532) The function name and the variable name are same, it is confusing

2015-08-18 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi closed FLINK-2532.
-
   Resolution: Fixed
Fix Version/s: 0.10

Via a9d55d3

 The function name and the variable name are same, it is confusing
 -

 Key: FLINK-2532
 URL: https://issues.apache.org/jira/browse/FLINK-2532
 Project: Flink
  Issue Type: Bug
Reporter: zhangrucong
Priority: Minor
 Fix For: 0.10


 In class StreamWindow, in function split, there is a list variable also 
 called split. The readable is poor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2477) Add test for SocketClientSink

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701357#comment-14701357
 ] 

ASF GitHub Bot commented on FLINK-2477:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/977


 Add test for SocketClientSink
 -

 Key: FLINK-2477
 URL: https://issues.apache.org/jira/browse/FLINK-2477
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.10
 Environment: win7 32bit;linux
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 Add some tests for SocketClientSink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701355#comment-14701355
 ] 

ASF GitHub Bot commented on FLINK-1962:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1004


 Add Gelly Scala API
 ---

 Key: FLINK-1962
 URL: https://issues.apache.org/jira/browse/FLINK-1962
 Project: Flink
  Issue Type: Task
  Components: Gelly, Scala API
Affects Versions: 0.9
Reporter: Vasia Kalavri
Assignee: PJ Van Aeken





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2529][runtime]remove some unused code

2015-08-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1022


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2529) fix on some unused code for flink-runtime

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701356#comment-14701356
 ] 

ASF GitHub Bot commented on FLINK-2529:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1022


 fix on some unused code for flink-runtime
 -

 Key: FLINK-2529
 URL: https://issues.apache.org/jira/browse/FLINK-2529
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 In file BlobServer.java, I found the Thread.currentThread() will never return 
 null in my learned knowledge.
 So I think shutdownHook != null“ is not necessary in code 'if (shutdownHook 
 != null  shutdownHook != Thread.currentThread())';
 And I am not complete sure.
 Maybe I am wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1962] Add Gelly Scala API v2

2015-08-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1004


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Add Scala version of GraphAlgorithm

2015-08-18 Thread PieterJanVanAeken
GitHub user PieterJanVanAeken opened a pull request:

https://github.com/apache/flink/pull/1033

Add Scala version of GraphAlgorithm

The Scala Gelly API already has a wrapper for running java-based 
GraphAlgorithms. It is needed for running pre-built algorithms such as the 
PageRankAlgorithm. This small addition maintains compatibility with these 
java-based GraphAlgorithms while giving the user an additional possibility of 
developing their own GraphAlgorithm that makes use of the syntactic sugar 
methods that the new Scala Gelly API offers.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PieterJanVanAeken/flink scala-gelly-algorithm

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1033.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1033


commit 0a5c46314ba2cf1e4bb8d7703b17c4cde1fbca51
Author: Pieter-Jan Van Aeken pieterjan.vanae...@euranova.eu
Date:   2015-08-18T14:48:44Z

Add Scala version of Graph Algorithm that allows for syntactic sugar inside 
the algorithm.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Add Scala version of GraphAlgorithm

2015-08-18 Thread PieterJanVanAeken
Github user PieterJanVanAeken commented on the pull request:

https://github.com/apache/flink/pull/1033#issuecomment-132241973
  
Please wait a sec with this one. Something is off with my local changes vs 
my repo changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1962] Add Gelly Scala API v2

2015-08-18 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1004#issuecomment-132241841
  
If the APIs are in sync there should be one documentation for both, right?
Similar as with the other APIs...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701396#comment-14701396
 ] 

ASF GitHub Bot commented on FLINK-1962:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1004#issuecomment-132241841
  
If the APIs are in sync there should be one documentation for both, right?
Similar as with the other APIs...


 Add Gelly Scala API
 ---

 Key: FLINK-1962
 URL: https://issues.apache.org/jira/browse/FLINK-1962
 Project: Flink
  Issue Type: Task
  Components: Gelly, Scala API
Affects Versions: 0.9
Reporter: Vasia Kalavri
Assignee: PJ Van Aeken
 Fix For: 0.10






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2529][runtime]remove some unused code

2015-08-18 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/1022#issuecomment-132205784
  
Ha...
Sorry for my English, just misunderstood the execution.
I will learn more about the Thread class.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2529) fix on some unused code for flink-runtime

2015-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701246#comment-14701246
 ] 

ASF GitHub Bot commented on FLINK-2529:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/1022#issuecomment-132205784
  
Ha...
Sorry for my English, just misunderstood the execution.
I will learn more about the Thread class.


 fix on some unused code for flink-runtime
 -

 Key: FLINK-2529
 URL: https://issues.apache.org/jira/browse/FLINK-2529
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 In file BlobServer.java, I found the Thread.currentThread() will never return 
 null in my learned knowledge.
 So I think shutdownHook != null“ is not necessary in code 'if (shutdownHook 
 != null  shutdownHook != Thread.currentThread())';
 And I am not complete sure.
 Maybe I am wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >