Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Aljoscha Krettek
I commented on the issue with a way that should work. On Fri, 9 Dec 2016 at 01:00 Chesnay Schepler wrote: > Done. https://issues.apache.org/jira/browse/FLINK-5299 > > On 08.12.2016 16:50, Ufuk Celebi wrote: > > Would you like to open an issue for this for starters Chesnay?

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Chesnay Schepler
Done. https://issues.apache.org/jira/browse/FLINK-5299 On 08.12.2016 16:50, Ufuk Celebi wrote: Would you like to open an issue for this for starters Chesnay? Would be good to fix for the upcoming release even. On 8 December 2016 at 16:39:58, Chesnay Schepler (ches...@apache.org) wrote: It

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Ufuk Celebi
Would you like to open an issue for this for starters Chesnay? Would be good to fix for the upcoming release even. On 8 December 2016 at 16:39:58, Chesnay Schepler (ches...@apache.org) wrote: > It would be neat if we could support arrays as keys directly; it should > boil down to checking the

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Chesnay Schepler
It would be neat if we could support arrays as keys directly; it should boil down to checking the key type and in case of an array injecting a KeySelector that calls Arrays.hashCode(array). This worked for me when i ran into the same issue while experimenting with some stuff. The batch API

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Ufuk Celebi
@Aljoscha: I remember that someone else ran into this, too. Should we address arrays as keys specifically in the API? Prohibit? Document this? – Ufuk On 7 December 2016 at 17:41:40, Andrew Roberts (arobe...@fuze.com) wrote: > Sure! > > (Aside, it turns out that the issue was using an

Re: Parallelism and stateful mapping with Flink

2016-12-07 Thread Andrew Roberts
Sure! (Aside, it turns out that the issue was using an `Array[Byte]` as a key - byte arrays don’t appear to have a stable hashCode. I’ll provide the skeleton for fullness, though.) val env = StreamExecutionEnvironment.getExecutionEnvironment

Re: Parallelism and stateful mapping with Flink

2016-12-07 Thread Stefan Richter
Hi, could you maybe provide the (minimal) code for the problematic job? Also, are you sure that the keyBy is working on the correct key attribute? Best, Stefan > Am 07.12.2016 um 15:57 schrieb Andrew Roberts : > > Hello, > > I’m trying to perform a stateful mapping of some

Parallelism and stateful mapping with Flink

2016-12-07 Thread Andrew Roberts
Hello, I’m trying to perform a stateful mapping of some objects coming in from Kafka in a parallelized flink job (set on the job using env.setParallelism(3)). The data source is a kafka topic, but the partitions aren’t meaningfully keyed for this operation (each kafka message is flatMapped to