Re: keyBy and parallelism

2018-04-12 Thread Ken Krugler
I’m not sure I understand the actual use case, but using rebalance() to distribute records evenly across operator instances is what I think you’d need to do to support “even if I have fewer keys than slots, I want each slot to take its share of the work”. So it sounds like you want to (a) broadcast all ru
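
To illustrate the trade-off behind the rebalance() suggestion, here is a minimal sketch in plain Java (no Flink dependency) of round-robin distribution, which is how rebalance() spreads records as far as I understand it. The class and method names (RoundRobinSketch, countsPerSubtask) and the record values are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RoundRobinSketch {
    // Hands records to subtasks in turn, ignoring each record's key.
    static int[] countsPerSubtask(List<String> records, int parallelism) {
        int[] counts = new int[parallelism];
        int next = 0;
        for (String record : records) {
            counts[next]++;
            next = (next + 1) % parallelism;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 160 records carrying only 8 distinct keys (hypothetical data).
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 160; i++) {
            records.add("key-" + (i % 8));
        }
        // With 16 subtasks, every subtask receives a share of the records.
        System.out.println(Arrays.toString(countsPerSubtask(records, 16)));
    }
}
```

With this input every one of the 16 subtasks receives 10 records, so no slot is idle. The price is that records with the same key no longer land on the same subtask, so per-key state cannot be relied on after such a redistribution.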

Re: keyBy and parallelism

2018-04-12 Thread Christophe Jolif
Sihua, On Thu, Apr 12, 2018 at 10:04 AM, 周思华 wrote: > Hi Christophe, > I think what you want to do is a "stream join", and I'm a bit confused: > if you already know there are only 8 keys, then why would you still like to > use a parallelism of 16? 8 of them will be idle (a waste of CPU). In the > Keye

Re: keyBy and parallelism

2018-04-12 Thread 周思华
Hi Christophe, I think what you want to do is a "stream join", and I'm a bit confused: if you already know there are only 8 keys, then why would you still like to use a parallelism of 16? 8 of them will be idle (a waste of CPU). In the KeyedStream, the tuples with the same key will be sent to the same

Re: keyBy and parallelism

2018-04-12 Thread Christophe Jolif
Thanks Chesnay (and others). That's what I had figured. Now let's move on to the follow-up with my exact use case. I have two streams, A and B. A receives "rules" that the processing of B must follow. There is a "key" that allows me to know that a rule x coming in A is f

Re: keyBy and parallelism

2018-04-12 Thread Chesnay Schepler
You will get 16 parallel executions since you specify a parallelism of 16; however, at least 8 of these will not receive any data. On 11.04.2018 23:29, Hao Sun wrote: From what I learnt, you have to control parallelism yourself. You can set parallelism on an operator or set a default one through flink-config.ya
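
A rough way to see why some of the 16 instances stay idle is to simulate hash partitioning in plain Java. The sketch below is loosely modeled on the idea of hashing each key into a "key group" and mapping key groups onto subtasks. The names (KeyBySketch, subtaskFor, scramble, busySubtasks), the max parallelism of 128, and the bit mixer are all assumptions for illustration; this is not the hash Flink actually uses, so the concrete subtask indices will differ from a real job.

```java
import java.util.Set;
import java.util.TreeSet;

class KeyBySketch {
    // Hypothetical stand-in for a key-to-subtask mapping: hash the key into
    // one of maxParallelism key groups, then map the key group to a subtask.
    static int subtaskFor(Object key, int maxParallelism, int parallelism) {
        int keyGroup = Math.floorMod(scramble(key.hashCode()), maxParallelism);
        return keyGroup * parallelism / maxParallelism;
    }

    // Simple bit mixer; an assumption for illustration only.
    static int scramble(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return h;
    }

    // Collects the distinct subtasks that receive at least one of the keys.
    static Set<Integer> busySubtasks(int keyCount, int maxParallelism, int parallelism) {
        Set<Integer> busy = new TreeSet<>();
        for (int i = 0; i < keyCount; i++) {
            busy.add(subtaskFor("key-" + i, maxParallelism, parallelism));
        }
        return busy;
    }

    public static void main(String[] args) {
        // 8 keys, 16 subtasks: at most 8 subtasks can ever receive data.
        Set<Integer> busy = busySubtasks(8, 128, 16);
        System.out.println("busy subtasks: " + busy.size() + " of 16 -> " + busy);
    }
}
```

Because each key always maps to exactly one subtask, 8 keys can occupy at most 8 of the 16 subtasks, and fewer if two keys happen to hash to the same one.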

Re: keyBy and parallelism

2018-04-11 Thread Hao Sun
From what I learnt, you have to control parallelism yourself. You can set parallelism on an operator or set a default one through flink-config.yaml. I might be wrong. On Wed, Apr 11, 2018 at 2:16 PM Christophe Jolif wrote: > Hi all, > > Imagine I have a default parallelism of 16 and I do something

keyBy and parallelism

2018-04-11 Thread Christophe Jolif
Hi all, Imagine I have a default parallelism of 16 and I do something like stream.keyBy("something").flatMap(). Now let's imagine I have fewer than 16 keys, maybe 8. How many parallel executions of the flatMap function will I get? 8 because I have 8 keys, or 16 because I have default parallelism