Hi Artem,
Thanks for working on improving the Sticky Partitioner!

I had a few questions about this portion:

*The batching will continue until either an in-flight batch completes or we
hit the N bytes and move to the next partition.  This way it takes just 5
records to get to batching mode, not 5 x (number of partitions) records, and
the batching mode will last longer, as we'll be batching while waiting for a
request to complete.  As the production rate accelerates, the logic will
automatically switch to use larger batches to sustain higher throughput.*

*If one of the brokers has higher latency, the records for the partitions
hosted on that broker are going to form larger batches, but it's still
going to be the same amount of records sent less frequently in larger
batches; the logic automatically adapts to that.*

I was curious about how the logic automatically switches here. It looks
like we are just adding *partitioner.sticky.batch.size*, which seems to be
a static value. Can you go into more detail about this logic, or clarify
something I may have missed?
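
To make my question concrete, here's how I currently picture the proposed
switching logic, as a rough Java sketch (every name below is my guess, not
actual code from the KIP):

public class StickyPartitionSketch {
    private final long stickyBatchSize;     // partitioner.sticky.batch.size
    private final int numPartitions;
    private int currentPartition = 0;
    private long bytesToCurrentPartition = 0;

    public StickyPartitionSketch(long stickyBatchSize, int numPartitions) {
        this.stickyBatchSize = stickyBatchSize;
        this.numPartitions = numPartitions;
    }

    // Called once per record; recordSize is the serialized size in bytes.
    public int partition(long recordSize) {
        bytesToCurrentPartition += recordSize;
        if (bytesToCurrentPartition >= stickyBatchSize) {
            // Static byte threshold reached: rotate to the next partition.
            currentPartition = (currentPartition + 1) % numPartitions;
            bytesToCurrentPartition = 0;
        }
        return currentPartition;
    }
}

If that reading is right, the threshold itself never changes, so I'd like
to understand where the "automatic" switch to larger batches comes from.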

On Mon, Nov 8, 2021 at 1:34 AM Luke Chen <show...@gmail.com> wrote:

> Thanks Artem,
> It's much better now.
> I've got your idea. In KIP-480: Sticky Partitioner, we change the
> partition (call the partitioner) when either one of the conditions below
> is met:
> 1. the batch is full
> 2. linger.ms is up
> But you are changing the definition: we switch once the
> "partitioner.sticky.batch.size" threshold is reached.
>
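> To spell out my understanding in code (just a sketch; the names are made
> up, not from the KIP or any patch):
>
> final class SwitchConditions {
>     // KIP-480: change partition when the current batch closes.
>     static boolean kip480ShouldSwitch(boolean batchFull, boolean lingerElapsed) {
>         return batchFull || lingerElapsed;
>     }
>     // KIP-794 proposal: change partition once the producer has sent
>     // "partitioner.sticky.batch.size" bytes to the current partition.
>     static boolean kip794ShouldSwitch(long bytesToPartition, long stickyBatchSize) {
>         return bytesToPartition >= stickyBatchSize;
>     }
> }
>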
> It'll fix the uneven distribution issue, because we'd do the sent-bytes
> calculation on the producer side.
> But it might have another issue: when the produce rate is low, there will
> be periods of time when the distribution is not even. Ex:
> tp-1: 12KB
> tp-2: 0KB
> tp-3: 0KB
> tp-4: 0KB
> while the producer keeps sending records to tp-1 (because we haven't
> reached the 16KB threshold).
> Maybe the user should set a good value for "partitioner.sticky.batch.size"
> to fix this issue?
>
> Some comments on the KIP:
> 1. This paragraph is a little confusing, because there's no "batch mode"
> or "non-batch mode", right?
>
> > The batching will continue until either an in-flight batch completes or
> > we hit the N bytes and move to the next partition.  This way it takes
> > just 5 records to get to batching mode, not 5 x (number of partitions)
> > records, and the batching mode will last longer, as we'll be batching
> > while waiting for a request to complete.
>
> Even with linger.ms=0, we're always batching (accumulating) records into
> batches until the sender thread is ready. So I think the "batch mode"
> description is confusing, and that's why I asked whether you have some
> kind of "batch switch" here.
>
> 2. In the motivation, you mentioned that one drawback of the current
> UniformStickyPartitioner is that "the sticky partitioner doesn't create
> batches as efficiently", because it sends out batches with only 1 record
> (under linger.ms=0). But I can't tell how the proposed solution fixes
> this inefficiency. I still see us sending 1 record per batch. Could you
> explain more here?
>
> Thank you.
> Luke
>
> On Sat, Nov 6, 2021 at 6:41 AM Artem Livshits
> <alivsh...@confluent.io.invalid> wrote:
>
> > Hi Luke,
> >
> > Thank you for your feedback.  I've updated the KIP with your suggestions.
> >
> > 1. Updated with a better example.
> > 2. I removed the reference to ClassicDefaultPartitioner; it was probably
> > confusing.
> > 3. The logic doesn't rely on checking batches; I've updated the proposal
> > to make it more explicit.
> > 4. The primary issue (uneven distribution) is described in the linked
> > jira; I copied an example from the jira into the KIP as well.
> >
> > -Artem
> >
> >
> > On Thu, Nov 4, 2021 at 8:34 PM Luke Chen <show...@gmail.com> wrote:
> >
> > > Hi Artem,
> > > Thanks for the KIP! And thanks for reminding me to complete KIP-782
> > > soon. :)
> > >
> > > Back to the KIP, I have some comments:
> > > 1. You proposed to have a new config, "partitioner.sticky.batch.size",
> > > but I can't see how we're going to use it to make the partitioner
> > > better. Please explain more in the KIP (with an example, as in
> > > suggestion (4)). I sketched after point 4 below how I assume a user
> > > would set it.
> > > 2. In the "Proposed change" section, you use "ClassicDefaultPartitioner"
> > > as an example. Is that referring to the current default sticky
> > > partitioner? It'd be better to give your proposed partitioner a
> > > different name, to distinguish the default one from the new one.
> > > (Although after implementation, we are going to just use the same name.)
> > > 3. So, if my understanding is correct, you're going to have a "batch"
> > > switch that is disabled until the in-flight request is full, and
> > > enabled after that. Is that right? Sorry, I don't see the advantage of
> > > having this batch switch. Could you explain more?
> > > 4. It would be clearer if you gave a concrete, real example in the
> > > motivation section describing the issue we face with the current
> > > sticky partitioner, and then, in the proposed changes section, used
> > > the same example to describe in more detail how your approach fixes it.
> > >
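> > > For (1), here's how I assume a user would end up setting the new
> > > config (just a sketch; the config name is the one proposed in the KIP,
> > > everything else is standard producer setup):
> > >
> > > import java.util.Properties;
> > > import org.apache.kafka.clients.producer.KafkaProducer;
> > >
> > > Properties props = new Properties();
> > > props.put("bootstrap.servers", "localhost:9092");
> > > props.put("key.serializer",
> > >     "org.apache.kafka.common.serialization.StringSerializer");
> > > props.put("value.serializer",
> > >     "org.apache.kafka.common.serialization.StringSerializer");
> > > // Proposed new config (in bytes); how this value changes partitioning
> > > // behavior is exactly what I'd like the KIP to spell out:
> > > props.put("partitioner.sticky.batch.size", "16384");
> > > KafkaProducer<String, String> producer = new KafkaProducer<>(props);
> > >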
> > > Thank you.
> > > Luke
> > >
> > > On Fri, Nov 5, 2021 at 1:38 AM Artem Livshits
> > > <alivsh...@confluent.io.invalid> wrote:
> > >
> > > > Hello,
> > > >
> > > > This is the discussion thread for
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-794%3A+Strictly+Uniform+Sticky+Partitioner
> > > > .
> > > >
> > > > The proposal is a bug fix for
> > > > https://issues.apache.org/jira/browse/KAFKA-10888, but it does
> > > > include a client config change, therefore we have a KIP to discuss.
> > > >
> > > > -Artem
> > > >
> > >
> >
>
