[jira] [Created] (FLINK-2187) KMeans clustering is not present in release-0.9-rc1

2015-06-08 Thread Sachin Goel (JIRA)
Sachin Goel created FLINK-2187: -- Summary: KMeans clustering is not present in release-0.9-rc1 Key: FLINK-2187 URL: https://issues.apache.org/jira/browse/FLINK-2187 Project: Flink Issue Type:

Re: Testing Apache Flink 0.9.0-rc1

2015-06-08 Thread Chiwan Park
Hi. I have a problem running the `mvn clean verify` command. TaskManagerFailsWithSlotSharingITCase hangs on Oracle JDK 7 (1.7.0_80), but on Oracle JDK 8 the test case doesn’t hang. I’ve investigated this problem but could not find the bug. Regards, Chiwan Park On Jun 9, 2015, at 2:11 AM,

Re: Testing Apache Flink 0.9.0-rc1

2015-06-08 Thread Chiwan Park
Hi. I’m very excited about preparing a new major release. :) I just picked two tests. I will report status as soon as possible. Regards, Chiwan Park On Jun 9, 2015, at 1:52 AM, Maximilian Michels m...@apache.org wrote: Hi everyone! As previously discussed, the Flink developer community is

Re: Testing Apache Flink 0.9.0-rc1

2015-06-08 Thread Márton Balassi
Added F7 Running against Kafka cluster for me in the doc. Doing it tomorrow. On Mon, Jun 8, 2015 at 7:00 PM, Chiwan Park chiwanp...@icloud.com wrote: Hi. I’m very excited about preparing a new major release. :) I just picked two tests. I will report status as soon as possible. Regards,

Testing Apache Flink 0.9.0-rc1

2015-06-08 Thread Maximilian Michels
Hi everyone! As previously discussed, the Flink developer community is very eager to get out a new major release. Apache Flink 0.9.0 will contain lots of new features and many bugfixes. This time, I'll try to coordinate the release process. Feel free to correct me if I'm doing something wrong

Testing Apache Flink 0.9.0-rc1

2015-06-08 Thread Ufuk Celebi
Hey Chiwan! Is the problem reproducible? Does it always deadlock? Can you please wait for it to deadlock and then post a stacktrace (jps and jstack) of the process? Please post it to this issue: FLINK-2183. Thanks :) – Ufuk On Monday, June 8, 2015, Chiwan Park chiwanp...@icloud.com

Re: Memleak in the SessionWindowing example

2015-06-08 Thread Gábor Gévay
I have now created the JIRA: https://issues.apache.org/jira/browse/FLINK-2181 Best regards, Gabor 2015-06-08 0:55 GMT+02:00 Robert Metzger rmetz...@apache.org: What is the status of this issue? I think we should at least file a JIRA for it to have it around as a TODO. On Thu, May 28, 2015

[jira] [Created] (FLINK-2181) SessionWindowing example has a memleak

2015-06-08 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2181: -- Summary: SessionWindowing example has a memleak Key: FLINK-2181 URL: https://issues.apache.org/jira/browse/FLINK-2181 Project: Flink Issue Type: Bug

Re: Planning the 0.9 Release

2015-06-08 Thread Márton Balassi
The problem is still there. @Aljoscha: It would be great if you could take it. On Mon, Jun 8, 2015 at 9:41 AM, Gyula Fóra gyf...@apache.org wrote: I agree with Marton. I thought Aljoscha was working on that. On Monday, June 8, 2015, Márton Balassi balassi.mar...@gmail.com wrote: FLINK-2054

Re: Problem with ML pipeline

2015-06-08 Thread Till Rohrmann
You're right Felix. You need to provide the `FitOperation` and `PredictOperation` for the `Predictor` you want to use and the `FitOperation` and `TransformOperation` for all `Transformer`s you want to chain in front of the `Predictor`. Specifying which features to take could be a solution.
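A minimal sketch of the pattern described above, written as plain Scala so that it runs without Flink. The three traits are simplified stand-ins, not the FlinkML definitions: `Seq` replaces `DataSet`, the signatures are assumptions, and `MaxScaler`, `MeanThreshold`, and `PipelineSketch` are hypothetical names introduced only for illustration. The transformer is applied in front of the predictor by hand rather than through a chaining operator.

```scala
// Simplified stand-ins for the operation type classes named above.
// NOT the FlinkML API: Seq replaces DataSet, signatures are assumed.
trait FitOperation[Self, Training] {
  def fit(instance: Self, input: Seq[Training]): Unit
}

trait TransformOperation[Self, In, Out] {
  def transform(instance: Self, input: Seq[In]): Seq[Out]
}

trait PredictOperation[Self, Testing, Prediction] {
  def predict(instance: Self, input: Seq[Testing]): Seq[Prediction]
}

// Hypothetical transformer: rescales values by the maximum seen during fit.
class MaxScaler { var max: Double = 1.0 }

object MaxScaler {
  implicit val fitOp: FitOperation[MaxScaler, Double] =
    new FitOperation[MaxScaler, Double] {
      def fit(instance: MaxScaler, input: Seq[Double]): Unit =
        instance.max = input.max
    }

  implicit val transformOp: TransformOperation[MaxScaler, Double, Double] =
    new TransformOperation[MaxScaler, Double, Double] {
      def transform(instance: MaxScaler, input: Seq[Double]): Seq[Double] =
        input.map(_ / instance.max)
    }
}

// Hypothetical predictor: learns the mean and flags values above it.
class MeanThreshold { var mean: Double = 0.0 }

object MeanThreshold {
  implicit val fitOp: FitOperation[MeanThreshold, Double] =
    new FitOperation[MeanThreshold, Double] {
      def fit(instance: MeanThreshold, input: Seq[Double]): Unit =
        instance.mean = input.sum / input.size
    }

  implicit val predictOp: PredictOperation[MeanThreshold, Double, Boolean] =
    new PredictOperation[MeanThreshold, Double, Boolean] {
      def predict(instance: MeanThreshold, input: Seq[Double]): Seq[Boolean] =
        input.map(_ > instance.mean)
    }
}

object PipelineSketch {
  // Each helper compiles only if the matching operation is in implicit scope.
  def fit[T, A](t: T, data: Seq[A])(implicit op: FitOperation[T, A]): Unit =
    op.fit(t, data)

  def transform[T, A, B](t: T, data: Seq[A])(
      implicit op: TransformOperation[T, A, B]): Seq[B] =
    op.transform(t, data)

  def predict[T, A, B](t: T, data: Seq[A])(
      implicit op: PredictOperation[T, A, B]): Seq[B] =
    op.predict(t, data)

  def main(args: Array[String]): Unit = {
    val scaler = new MaxScaler
    val model = new MeanThreshold
    val training = Seq(1.0, 2.0, 4.0)

    fit(scaler, training)                    // fit the transformer first
    val scaled = transform(scaler, training) // then transform the training data
    fit(model, scaled)                       // fit the predictor on transformed data

    // New data passes through the transformer before prediction.
    println(predict(model, transform(scaler, Seq(4.0, 1.0))))
  }
}
```

In this sketch a missing operation shows up as a compile error (no implicit found) rather than a runtime failure, which is why each `Predictor` and `Transformer` needs its operations supplied up front.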

[jira] [Created] (FLINK-2182) Add stateful Streaming Sequence Source

2015-06-08 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-2182: --- Summary: Add stateful Streaming Sequence Source Key: FLINK-2182 URL: https://issues.apache.org/jira/browse/FLINK-2182 Project: Flink Issue Type:

Re: Problem with ML pipeline

2015-06-08 Thread Sachin Goel
Yes. I agree too. It makes no sense for the learning algorithm to have extra payload. Only relevant data makes sense. Further, adding ID to the predict operation type definition seems a legitimate choice. +1 from my side. Regards Sachin Goel On Mon, Jun 8, 2015 at 4:06 PM, Theodore Vasiloudis

Re: Problem with ML pipeline

2015-06-08 Thread Felix Neutatz
I am in favor of efficiency. Therefore I would prefer to introduce new methods in order to save memory and network traffic. This would also solve the problem of how to come up with ids. Best regards, Felix On 08.06.2015 at 12:52 PM, Sachin Goel sachingoel0...@gmail.com wrote: I think if

Re: Problem with ML pipeline

2015-06-08 Thread Sachin Goel
That would be better of course. My opinion had to do with not-implementing-exactly-the-same-thing-twice. Perhaps Till could weigh in here. We really do need to come up with a general mechanism for this. Testing labeled vectors has exactly the same problem. I'll look into how Spark and sci-kit

Re: Problem with ML pipeline

2015-06-08 Thread Sachin Goel
I think if the user doesn't provide IDs, we can safely assume that they don't need them. We can simply assign an ID of one as a temporary measure and return the result with no IDs [just to make the interface cleaner]. If IDs are provided, we simply use those. A possible

Re: Problem with ML pipeline

2015-06-08 Thread Theodore Vasiloudis
I agree with Mikio: ids would be useful overall, and feature selection should not be part of the learning algorithms; all features in a LabeledVector should be assumed to be relevant by the learners. On Mon, Jun 8, 2015 at 12:00 PM, Mikio Braun mikiobr...@googlemail.com wrote: Hi all, I think

Re: Problem with ML pipeline

2015-06-08 Thread Till Rohrmann
My gut feeling is also that a `Transformer` would be a good place to implement feature selection. Then you can simply reuse it across multiple algorithms by simply chaining them together. However, I don't know yet what's the best way to realize the IDs. One way would be to add an ID field to

[jira] [Created] (FLINK-2184) Cannot get last element with maxBy/minBy

2015-06-08 Thread Gábor Hermann (JIRA)
Gábor Hermann created FLINK-2184: Summary: Cannot get last element with maxBy/minBy Key: FLINK-2184 URL: https://issues.apache.org/jira/browse/FLINK-2184 Project: Flink Issue Type:

[jira] [Created] (FLINK-2186) Rework SVM import to support very wide files

2015-06-08 Thread Theodore Vasiloudis (JIRA)
Theodore Vasiloudis created FLINK-2186: -- Summary: Rework SVM import to support very wide files Key: FLINK-2186 URL: https://issues.apache.org/jira/browse/FLINK-2186 Project: Flink Issue