[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-22 Thread thvasilo
Github user thvasilo commented on the issue: https://github.com/apache/flink/pull/2740 Hello @tfournier314, I should have clarified: by documentation I meant that, apart from the docstrings you have added now, we also have to include documentation in the Flink

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-22 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 Hello @thvasilo @greghogan OK, I've updated the documentation. I'm standing by to update the code. Regards, Thomas --- If your project is set up for it, you can reply to this

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-21 Thread thvasilo
Github user thvasilo commented on the issue: https://github.com/apache/flink/pull/2740 Hello @tfournier314, This PR is still missing documentation. After that is done, a project committer will have to review it before it gets merged, which might take a while.

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-21 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan @thvasilo What's the next step? More tests and reviews?

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-16 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan OK, I've pushed the code with my tests and some modifications to the mapping. @thvasilo It seems to work perfectly!

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-15 Thread thvasilo
Github user thvasilo commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan Excuse my ignorance, I'm only now learning about Flink internals :) It seems like the issue here was that `partitionByRange` partitions keys in ascending order but we want the end

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-15 Thread greghogan
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2740 `zipWithIndex` preserves the order between partitions (DataSetUtils.java:121). @tfournier314, I don't think it's a problem pushing your current code since we're still discussing the PR.

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-15 Thread thvasilo
Github user thvasilo commented on the issue: https://github.com/apache/flink/pull/2740 Hello @tfournier314, I tested your code and it does seem that partitions are sorted only internally, which is expected, and `zipWithIndex` is, AFAIK, unaware of the sorted (as in key range) order

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-14 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan I've not pushed the code yet because my tests are still incorrect. Indeed the following code: val env = ExecutionEnvironment.getExecutionEnvironment val fitData =

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-09 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @thvasilo @greghogan I've updated my code so that I'm streaming instead of caching with a `collect()`. Does that seem OK to you?

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-04 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 I've changed my code so that I now have `mapping: DataSet[(String, Long)]`: val mapping = input .mapWith( s => (s, 1) ) .groupBy( 0 ) .reduce( (a, b) => (a._1,

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-02 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 Yes, I've just updated the PR title