Re: Guide/design doc for streaming operator states

2015-07-30 Thread Aljoscha Krettek
Ah ok. I think that keyBy() can normally not be chained because we don't know how the fields in the emitted object change. On Thu, 30 Jul 2015 at 13:40 Gyula Fóra gyula.f...@gmail.com wrote: Thanks for the feedback :) My idea when I wrote that was that you can chain keyBy statements to

[jira] [Created] (FLINK-2436) Make ByteStreamStateHandles more robust

2015-07-30 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-2436: - Summary: Make ByteStreamStateHandles more robust Key: FLINK-2436 URL: https://issues.apache.org/jira/browse/FLINK-2436 Project: Flink Issue Type: Improvement

Re: Hello guys, I have met a problem when use mvn to build flink on mac

2015-07-30 Thread Stephan Ewen
This does not look like a Flink problem, but like a maven plugin bug, or a setup problem. Can you search is that is a known error in maven/scala? On Thu, Jul 30, 2015 at 3:53 PM, skaterQiang xqfly...@163.com wrote: Hello guys, I have met a problem when use mvn to build flink on mac I use

Re: Guide/design doc for streaming operator states

2015-07-30 Thread Gyula Fóra
Thanks for the feedback :) My idea when I wrote that was that you can chain keyBy statements to maintain order if your key does not change. Otherwise you are right, we need a sorting operator. Gyula Aljoscha Krettek aljos...@apache.org ezt írta (időpont: 2015. júl. 30., Cs, 13:18): Hi, sorry

Hello guys, I have met a problem when use mvn to build flink on mac

2015-07-30 Thread skaterQiang
Hello guys, I have met a problem when use mvn to build flink on mac I use mvn install -DskipTests and it throw following error, how do I fix it? [INFO] --- scala-maven-plugin:3.1.4:compile (scala-compile-first) @ flink-runtime --- [INFO]

[jira] [Created] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection

2015-07-30 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2437: -- Summary: TypeExtractor.analyzePojo has some problems around the default constructor detection Key: FLINK-2437 URL: https://issues.apache.org/jira/browse/FLINK-2437

Re: Types in the Python API

2015-07-30 Thread Aljoscha Krettek
I believe it should be possible to create a special PythonTypeInfo where the python side is responsible for serializing data to a byte array and to the java side it is just a byte array and all the comparisons are also performed on these byte arrays. I think partitioning and sort should still

[jira] [Created] (FLINK-2442) PojoType fields not supported by field position keys

2015-07-30 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2442: Summary: PojoType fields not supported by field position keys Key: FLINK-2442 URL: https://issues.apache.org/jira/browse/FLINK-2442 Project: Flink Issue

Re: Types in the Python API

2015-07-30 Thread Chesnay Schepler
because it still goes through the Java API that requires some kind of type information. imagine a java api program where you omit all generic types, it just wouldn't work as of now. On 30.07.2015 21:17, Gyula Fóra wrote: Hey! Could anyone briefly tell me what exactly is the reason why we

Re: Types in the Python API

2015-07-30 Thread Chesnay Schepler
I can see this working for basic types, but am unsure how it would work with Tuples. Wouldn't the java API still need to know the arity to setup serializers? On 30.07.2015 23:02, Aljoscha Krettek wrote: I believe it should be possible to create a special PythonTypeInfo where the python side

Re: Types in the Python API

2015-07-30 Thread Chesnay Schepler
To be perfectly honest i never really managed to work my way through Spark's python API, it's a whole bunch of magic to me; not even the general structure is understandable. With pure python do you mean doing everything in python? as in just having serialized data on the java side? I

[jira] [Created] (FLINK-2445) Add tests for HadoopOutputFormats

2015-07-30 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2445: Summary: Add tests for HadoopOutputFormats Key: FLINK-2445 URL: https://issues.apache.org/jira/browse/FLINK-2445 Project: Flink Issue Type: Test

[jira] [Created] (FLINK-2446) SocketTextStreamFunction has memory leak when reconnect server

2015-07-30 Thread fangfengbin (JIRA)
fangfengbin created FLINK-2446: -- Summary: SocketTextStreamFunction has memory leak when reconnect server Key: FLINK-2446 URL: https://issues.apache.org/jira/browse/FLINK-2446 Project: Flink

Re: Types in the Python API

2015-07-30 Thread Aljoscha Krettek
I think then the Python part would just serialize all the tuple fields to a big byte array. And all the key fields to another array, so that the java side can to comparisons on the whole key blob. Maybe it's overly simplistic, but it might work. :D On Thu, 30 Jul 2015 at 23:35 Chesnay Schepler

[jira] [Created] (FLINK-2440) [py] Expand Environment feature coverage

2015-07-30 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-2440: --- Summary: [py] Expand Environment feature coverage Key: FLINK-2440 URL: https://issues.apache.org/jira/browse/FLINK-2440 Project: Flink Issue Type:

[jira] [Created] (FLINK-2441) [py] Introduce an OpInfo object on the python side

2015-07-30 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-2441: --- Summary: [py] Introduce an OpInfo object on the python side Key: FLINK-2441 URL: https://issues.apache.org/jira/browse/FLINK-2441 Project: Flink Issue

[jira] [Created] (FLINK-2434) org.apache.hadoop:hadoop-yarn-common:jar with value 'jersey-test-framework-grizzly2+' does not match a valid id pattern

2015-07-30 Thread caofangkun (JIRA)
caofangkun created FLINK-2434: - Summary: org.apache.hadoop:hadoop-yarn-common:jar with value 'jersey-test-framework-grizzly2+' does not match a valid id pattern Key: FLINK-2434 URL:

Re: A soft reminder

2015-07-30 Thread Andra Lungu
Hi Gabor, Within a delta iteration right? On Thu, Jul 30, 2015 at 6:31 PM, Gábor Gévay gga...@gmail.com wrote: Hi, I have also run into this problem just now. It only happens with much data. Best regards, Gabor 2015-07-27 11:35 GMT+02:00 Felix Neutatz neut...@googlemail.com: Hi,

[jira] [Created] (FLINK-2438) Improve performance of channel events

2015-07-30 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2438: --- Summary: Improve performance of channel events Key: FLINK-2438 URL: https://issues.apache.org/jira/browse/FLINK-2438 Project: Flink Issue Type: Bug

Re: A soft reminder

2015-07-30 Thread Gábor Gévay
Hi, I have also run into this problem just now. It only happens with much data. Best regards, Gabor 2015-07-27 11:35 GMT+02:00 Felix Neutatz neut...@googlemail.com: Hi, I also encountered the EOF exception for a delta iteration with more data. With less data it works ... Best regards,

Re: A soft reminder

2015-07-30 Thread Gábor Gévay
Yes, in a VertexCentricIteration with a few million nodes, running locally on my laptop with about 10 GB of memory given to java. Best, Gabor 2015-07-30 18:32 GMT+02:00 Andra Lungu lungu.an...@gmail.com: Hi Gabor, Within a delta iteration right? On Thu, Jul 30, 2015 at 6:31 PM, Gábor

Re: A soft reminder

2015-07-30 Thread Gábor Gévay
It is working with setSolutionSetUnmanagedMemory(true), thanks! Gabor 2015-07-30 19:23 GMT+02:00 Andra Lungu lungu.an...@gmail.com: Could you try adding the following lines to your code? VertexCentricConfiguration parameters = new VertexCentricConfiguration();

[jira] [Created] (FLINK-2439) [py] Expand DataSet feature coverage

2015-07-30 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-2439: --- Summary: [py] Expand DataSet feature coverage Key: FLINK-2439 URL: https://issues.apache.org/jira/browse/FLINK-2439 Project: Flink Issue Type:

Types in the Python API

2015-07-30 Thread Gyula Fóra
Hey! Could anyone briefly tell me what exactly is the reason why we force the users in the Python API to declare types for operators? I don't really understand how this works in different systems but I am just curious why Flink has types and why Spark doesn't for instance. If you give me some

Re: question about SlidingPreReducer.java

2015-07-30 Thread Till Rohrmann
Hi MaGuoWei, do you mean that the branch of the if statement (line 130) is never executed? Or are you looking for an example which uses the SlidingPreReducer ? Cheers, Till ​ On Thu, Jul 30, 2015 at 11:33 AM, MaGuoWei maguo...@outlook.com wrote: hi guysThere is a function updateCurrent() in

Re: question about SlidingPreReducer.java

2015-07-30 Thread Till Rohrmann
But if the updateCurrent method is called at least twice with a value different from null, then the else branch should be executed if I’m not mistaken. In the first call currentReduced will be sets to something other than null (if branch) and in the second call the reducer will be called with the

Re: question about SlidingPreReducer.java

2015-07-30 Thread Aljoscha Krettek
Hi, I also had this suspicion in the past. The sliding pre reducers are horribly slow. For example, this code: ds.window(...).every(...).mapWindow(new MyWindowMapReducer()) is a lot faster than this: ds.window(...).every(...).reduceWindow(new MyWindowReducer()) We are currently working on this

RE: question about SlidingPreReducer.java

2015-07-30 Thread MaGuoWei
I find function addToBufferIfEligible always resets currentReduced to null so that branch can never be reached. (SlidingCountPreReducer.java) Date: Thu, 30 Jul 2015 12:06:54 +0200 Subject: Re: question about SlidingPreReducer.java From: till.rohrm...@gmail.com To: dev@flink.apache.org

[jira] [Created] (FLINK-2435) Add support for custom CSV field parsers

2015-07-30 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2435: Summary: Add support for custom CSV field parsers Key: FLINK-2435 URL: https://issues.apache.org/jira/browse/FLINK-2435 Project: Flink Issue Type: New

question about SlidingPreReducer.java

2015-07-30 Thread MaGuoWei
hi guysThere is a function updateCurrent() in this class(SlidingPreReducer).I think there is no chance to run the following code in this function:currentReduced = reducer.reduce(serializer.copy(currentReduced), element);Can any one give me a example that can run this code. (I have already see

Re: Guide/design doc for streaming operator states

2015-07-30 Thread Aljoscha Krettek
Hi, sorry for the long wait but I finally found the time to read it. It looks good but the later parts of course still need to be fleshed out. I have one comments/questions: In the description of partitioned state you have this sentence: Operations using partitioned state can also benefit from