Hi Artem,

Thanks for working on improving the Sticky Partitioner! I had a few questions about this portion:
*The batching will continue until either an in-flight batch completes or we hit the N bytes and move to the next partition. This way it takes just 5 records to get to batching mode, not 5 x number of partition records, and the batching mode will stay longer as we'll be batching while waiting for a request to be completed. As the production rate accelerates, the logic will automatically switch to use larger batches to sustain higher throughput.*

*If one of the brokers has higher latency, the records for the partitions hosted on that broker are going to form larger batches, but it's still going to be the same amount of records, sent less frequently in larger batches; the logic automatically adapts to that.*

I was curious about how the logic automatically switches here. It seems like we are just adding *partitioner.sticky.batch.size*, which seems to be a static value. Can you go into more detail about this logic? Or clarify something I may have missed.

On Mon, Nov 8, 2021 at 1:34 AM Luke Chen <show...@gmail.com> wrote:

> Thanks Artem,
> It's much better now.
> I've got your idea. In KIP-480: Sticky Partitioner, we change the partition
> (call the partitioner) when either one of the conditions below matches:
> 1. the batch is full
> 2. linger.ms is up
> But you are changing the definition, so that we switch once a
> "partitioner.sticky.batch.size" worth of bytes has been sent.
>
> It'll fix the uneven distribution issue, because we do the sent-bytes
> calculation on the producer side.
> But it might have another issue: when the produce rate is low, there
> will be some period of time in which the distribution is not even. Ex:
> tp-1: 12KB
> tp-2: 0KB
> tp-3: 0KB
> tp-4: 0KB
> while the producer keeps sending records into tp-1 (because we
> haven't reached the 16KB threshold).
> Maybe the user should set a good value for "partitioner.sticky.batch.size"
> to fix this issue?
>
> Some comments on the KIP:
> 1. This paragraph is a little confusing, because there's no "batch mode" or
> "non-batch mode", right?
>
> > The batching will continue until either an in-flight batch completes or
> > we hit the N bytes and move to the next partition. This way it takes just
> > 5 records to get to batching mode, not 5 x number of partition records, and
> > the batching mode will stay longer as we'll be batching while waiting for a
> > request to be completed.
>
> Even with linger.ms=0, before the sender thread is ready we're always
> batching (accumulating) records into batches. So I think the "batch mode"
> description is confusing. And that's why I asked whether you have some kind
> of "batch switch" here.
>
> 2. In the motivation, you mentioned that one drawback of the current
> UniformStickyPartitioner is that "the sticky partitioner doesn't create
> batches as efficiently", because it sends out a batch with only 1 record
> (under linger.ms=0). But I can't tell how the proposed solution fixes this
> inefficiency. I still see us sending 1 record within 1 batch. Could you
> explain more here?
>
> Thank you.
> Luke
>
> On Sat, Nov 6, 2021 at 6:41 AM Artem Livshits
> <alivsh...@confluent.io.invalid> wrote:
>
> > Hi Luke,
> >
> > Thank you for your feedback. I've updated the KIP with your suggestions.
> >
> > 1. Updated with a better example.
> > 2. I removed the reference to ClassicDefaultPartitioner; it was probably
> > confusing.
> > 3. The logic doesn't rely on checking batches; I've updated the proposal
> > to make it more explicit.
> > 4. The primary issue (uneven distribution) is described in the linked
> > jira; I copied an example from the jira into the KIP as well.
> >
> > -Artem
> >
> > On Thu, Nov 4, 2021 at 8:34 PM Luke Chen <show...@gmail.com> wrote:
> >
> > > Hi Artem,
> > > Thanks for the KIP! And thanks for reminding me to complete KIP-782
> > > soon. :)
> > >
> > > Back to the KIP, I have some comments:
> > > 1. You proposed a new config, "partitioner.sticky.batch.size", but
> > > I can't see how we're going to use it to make the partitioner better.
> > > Please explain more in the KIP (an example would be better, as in
> > > suggestion (4)).
> > > 2. In the "Proposed change" section, you use "ClassicDefaultPartitioner"
> > > in an example; is that referring to the current default sticky
> > > partitioner? I think it'd be better to give your proposed partitioner a
> > > different name, to distinguish it from the default one.
> > > (Although after implementation, we are going to use the same name.)
> > > 3. So, if my understanding is correct, you're going to have a "batch"
> > > switch that is disabled until the in-flight request is full, and enabled
> > > otherwise. Is that right? Sorry, I don't see any advantage of having this
> > > batch switch. Could you explain more?
> > > 4. I think it would be clearer if you had a concrete, real example in
> > > the motivation section, describing the issue we face with the current
> > > sticky partitioner, and then, in the proposed changes section, used the
> > > same example to describe in more detail how your change fixes it.
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Fri, Nov 5, 2021 at 1:38 AM Artem Livshits
> > > <alivsh...@confluent.io.invalid> wrote:
> > >
> > > > Hello,
> > > >
> > > > This is the discussion thread for
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-794%3A+Strictly+Uniform+Sticky+Partitioner
> > > >
> > > > The proposal is a bug fix for
> > > > https://issues.apache.org/jira/browse/KAFKA-10888, but it does
> > > > include a client config change, therefore we have a KIP to discuss.
> > > >
> > > > -Artem
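To make the byte-based switching discussed in this thread concrete, here is a minimal toy sketch. This is not Kafka's actual implementation; the class name, the 16KB value for `partitioner.sticky.batch.size`, and the 1KB record size are illustrative assumptions. The model sticks to one partition until the configured number of bytes has been produced to it, then moves to the next. It also reproduces the low-produce-rate unevenness Luke points out (12KB on one partition, 0 on the rest), and shows the per-partition byte counts evening out once the total volume is much larger than the threshold.

```python
# Toy model of byte-based sticky partition switching.
# NOT Kafka's code: class name, threshold, and record sizes are assumptions.

STICKY_BATCH_SIZE = 16 * 1024  # hypothetical partitioner.sticky.batch.size

class ToyStickyPartitioner:
    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self.current = 0                # the current sticky partition
        self.bytes_on_current = 0       # bytes produced during this stint
        self.bytes_per_partition = [0] * num_partitions

    def partition(self, record_size):
        # Move to the next partition only after STICKY_BATCH_SIZE bytes
        # have been produced to the current one.
        if self.bytes_on_current >= STICKY_BATCH_SIZE:
            self.current = (self.current + 1) % self.num_partitions
            self.bytes_on_current = 0
        self.bytes_on_current += record_size
        self.bytes_per_partition[self.current] += record_size
        return self.current

p = ToyStickyPartitioner(num_partitions=4)

# Low produce rate, small total volume: everything lands on partition 0,
# matching the uneven-distribution concern in the thread (12KB vs 0KB).
for _ in range(12):
    p.partition(1024)  # 12 x 1KB = 12KB, below the 16KB threshold
print(p.bytes_per_partition)  # [12288, 0, 0, 0]

# Over a much larger volume, the per-partition counts even out: no two
# partitions differ by more than one threshold's worth of bytes.
for _ in range(4000):
    p.partition(1024)
print(p.bytes_per_partition)
```

The short-term skew is inherent to any sticky scheme: stickiness trades momentary imbalance for larger batches, and the byte threshold only bounds how large that imbalance can get.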