Re: KeyBy distribution across taskslots

2019-02-28 Thread Congxian Qiu
Sent: Thursday, February 28, 2019 18:28 To: Aggarwal, Ajay Cc: user@flink.apache.org Subject: Re: KeyBy distribution across taskslots Hi, The answer is in fact no. Flink hash-partitions keys into Key Groups [1] which are uniformly assigned to tasks, i.

Re: KeyBy distribution across taskslots

2019-02-28 Thread Yun Tang
From: Fabian Hueske Sent: Thursday, February 28, 2019 18:28 To: Aggarwal, Ajay Cc: user@flink.apache.org Subject: Re: KeyBy distribution across taskslots Hi, The answer is in fact no. Flink hash-partitions keys into Key Groups [1] which are uniformly assigned

Re: KeyBy distribution across taskslots

2019-02-28 Thread Fabian Hueske
Hi, The answer is in fact no. Flink hash-partitions keys into Key Groups [1] which are uniformly assigned to tasks, i.e., a task can process more than one key group. AFAIK, there are no plans to change this behavior. Stefan (in CC) might be able to give more details on this. Something that might
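
To make the key-group mechanism described in this reply concrete, here is a rough, self-contained sketch of the two-step mapping: a key is hashed into one of maxParallelism key groups, and each key group falls into a contiguous range owned by one parallel subtask. The class and method names (KeyGroupSketch, mix, keyGroupFor, subtaskForKeyGroup) are made up for illustration, and mix() is only a stand-in for the murmur hash Flink applies to key.hashCode(), so the concrete group numbers will not match a real job. The range formula keyGroup * parallelism / maxParallelism is, as far as I know, the one Flink uses.

```java
import java.util.Arrays;

final class KeyGroupSketch {

    // Stand-in bit mixer (an assumption, NOT Flink's exact murmur hash).
    // The result is forced to be non-negative so the modulo below is safe.
    static int mix(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        h *= 0xc2b2ae35;
        h ^= (h >>> 16);
        return h & Integer.MAX_VALUE;
    }

    // Step 1: hash the key into one of maxParallelism key groups.
    static int keyGroupFor(Object key, int maxParallelism) {
        return mix(key.hashCode()) % maxParallelism;
    }

    // Step 2: each subtask owns a contiguous range of key groups.
    static int subtaskForKeyGroup(int maxParallelism, int parallelism, int keyGroup) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Both steps combined: which subtask processes this key.
    static int subtaskFor(Object key, int maxParallelism, int parallelism) {
        return subtaskForKeyGroup(maxParallelism, parallelism, keyGroupFor(key, maxParallelism));
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // number of key groups (hypothetical setting)
        int parallelism = 4;      // number of parallel subtasks (hypothetical setting)
        int[] keysPerSubtask = new int[parallelism];
        for (int i = 0; i < 1000; i++) {
            keysPerSubtask[subtaskFor("customer-" + i, maxParallelism, parallelism)]++;
        }
        // Counts distinct keys per subtask; it says nothing about records per key.
        System.out.println(Arrays.toString(keysPerSubtask));
    }
}
```

The point of the sketch is that the assignment depends only on the key's hash, never on how many records a key produces. The number of keys per subtask may come out roughly even, but two very noisy customers can still hash into key groups owned by the same subtask.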

KeyBy distribution across taskslots

2019-02-27 Thread Aggarwal, Ajay
I couldn’t find a reference to it anywhere in the docs, so I thought I would ask here. When I use the keyBy operator, say keyBy("customerId"), and some keys (i.e., customers) are far noisier than others, is there a way to ensure that too many noisy customers do not land on the same task slot? In