Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-17 Thread Artem Livshits
Hi all, Thank you to everybody who contributed to this discussion, we've come a long way and now the "rejected alternatives" section is almost as big as the actual proposal (which indicates that it's a well thought through solution :-)). If there are no further considerations, I'll start voting

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-14 Thread Artem Livshits
Hi Jun, 33. Sounds good. Updated the KIP. -Artem On Mon, Mar 14, 2022 at 5:45 PM Jun Rao wrote: > Hi, Artem, > > 33. We introduced onNewBatch() primarily for the sticky partitioner. It > seems to be a very subtle API to explain and to use properly. If we can't > find any convincing usage,

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-14 Thread Jun Rao
Hi, Artem, 33. We introduced onNewBatch() primarily for the sticky partitioner. It seems to be a very subtle API to explain and to use properly. If we can't find any convincing usage, it's probably better to deprecate it so that we could keep the API clean. Thanks, Jun On Mon, Mar 14, 2022 at

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-14 Thread Artem Livshits
Hi Jun, 33. That's an interesting point. Technically, onNewBatch is just a way to pass some signal to the partitioner, the sticky partitioner uses this signal that is suboptimal, in theory someone could use it for something else -Artem On Mon, Mar 14, 2022 at 9:11 AM Jun Rao wrote: > Hi,

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-14 Thread Jun Rao
Hi, Artem, Thanks for the reply. A couple of more comments. 32. We could defer the metrics until we know what to add. 33. Since we are deprecating DefaultPartitioner and UniformStickyPartitioner, should we depreciate Partitioner.onNewBatch() too given its unexpected side effect? Thanks, Jun

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-10 Thread Artem Livshits
Hi Jun, 32. Good point. Do you think it's ok to defer the metrics until we run some benchmarks so that we get a better idea of what metrics we need? -Artem On Thu, Mar 10, 2022 at 3:12 PM Jun Rao wrote: > Hi, Artem. > > Thanks for the reply. One more comment. > > 32. Do we need to add any

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-10 Thread Jun Rao
Hi, Artem. Thanks for the reply. One more comment. 32. Do we need to add any new metric on the producer? For example, if partitioner.availability.timeout.ms is > 0, it might be useful to know the number of unavailable partitions. Thanks, Jun On Thu, Mar 10, 2022 at 12:46 PM Artem Livshits

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-10 Thread Artem Livshits
Hi Jun, 30. Clarified. 31. I plan to do some benchmarking once implementation is finished, I'll update the KIP with the results once I have them. The reason to make it default is that it won't be used otherwise and we won't know if it's good or not in practical workloads. -Artem On Thu, Mar

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-10 Thread Jun Rao
Hi, Artem, Thanks for the updated KIP. A couple of more comments. 30. For the 3 new configs, it would be useful to make it clear that they are only relevant when the partitioner class is null. 31. partitioner.adaptive.partitioning.enable : I am wondering whether it should default to true. This

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-09 Thread Artem Livshits
Thank you for feedback, I've discussed this offline with some of the folks and updated the KIP. The main change is that now instead of using DefaultPartitioner and UniformStickyPartitioners as flags, in the new proposal the default partitioner is null, so if no custom partitioner is specified

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-04 Thread Luke Chen
Hi Artem, Thanks for your explanation and update to the KIP. Some comments: 5. In the description for `enable.adaptive.partitioning`, the `false` case, you said: > the producer will try to distribute messages uniformly. I think we should describe the possible skewing distribution. Otherwise,

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-04 Thread Ismael Juma
Regarding `3`, we should only deprecate it if we're sure the new approach handles all cases better. Are we confident about that for both of the previous partitioners? Ismael On Fri, Mar 4, 2022 at 1:37 AM David Jacot wrote: > Hi Artem, > > Thanks for the KIP! I have a few comments: > > 1. In

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-04 Thread David Jacot
Hi Artem, Thanks for the KIP! I have a few comments: 1. In the preamble of the proposed change section, there is still a mention of the -1 approach. My understanding is that we have moved away from it now. 2. I am a bit concerned by the trick suggested about the DefaultPartitioner and the

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-03 Thread Artem Livshits
Hi Jun, 2. Removed the option from the KIP. Now the sticky partitioning threshold is hardcoded to batch.size. 20. Added the corresponding wording to the KIP. -Artem On Thu, Mar 3, 2022 at 10:52 AM Jun Rao wrote: > Hi, Artem, > > Thanks for the reply. > > 1. Sounds good. > > 2. If we don't

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-03 Thread Jun Rao
Hi, Artem, Thanks for the reply. 1. Sounds good. 2. If we don't expect users to change it, we probably could just leave out the new config. In general, it's easy to add a new config, but hard to remove an existing config. 20. The two new configs enable.adaptive.partitioning and

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-03-03 Thread Artem Livshits
Hi Jun, Thank you for the suggestions. 1. As we discussed offline, we can hardcode the logic for DefaultPartitioner and UniformStickyPartitioner in the KafkaProducer (i.e. the DefaultPartitioner.partition won't get called, instead KafkaProducer would check if the partitioner is an instance of

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-02-28 Thread Jun Rao
Hi, Artem, Thanks for the reply. A few more comments. 1. Since we control the implementation and the usage of DefaultPartitioner, another way is to instantiate the DefaultPartitioner with a special constructor, which allows it to have more access to internal information. Then we could just

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-02-25 Thread Artem Livshits
Hi Jun, 1. Updated the KIP to add a couple paragraphs about implementation necessities in the Proposed Changes section. 2. Sorry if my reply was confusing, what I meant to say (and I elaborated on that in point #3) is that there could be patterns for which 16KB wouldn't be the most effective

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-02-23 Thread Jun Rao
Hi, Artem, Thanks for the reply. A few more comments. 1. Perhaps you could elaborate a bit more on how the producer determines the partition if the partitioner returns -1. This will help understand why encapsulating that logic as a partitioner is not clean. 2. I am not sure that I understand

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-02-18 Thread Artem Livshits
Hello Luke, Jun, Thank you for your feedback. I've added the Rejected Alternative section that may clarify some of the questions w.r.t. returning -1. 1. I've elaborated on the -1 in the KIP. The problem is that a significant part of the logic needs to be in the producer (because it now uses

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-02-17 Thread Jun Rao
Hi, Artem, Thanks for the KIP. A few comments below. 1. I agree with Luke that having the partitioner returning -1 is kind of weird. Could we just change the implementation of DefaultPartitioner to the new behavior? 2. partitioner.sticky.batch.size: Similar question to Luke. I am not sure why

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-02-16 Thread Luke Chen
Hi Artem, Also, one more thing I think you need to know. As this bug KAFKA-7572 mentioned, sometimes the custom partitioner would return negative partition id accidentally. If it returned -1, how could you know if it is expected or not expected?

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-02-15 Thread Luke Chen
Hi Artem, Thanks for the update. I have some questions about it: 1. Could you explain why you need the `partitioner` return -1? In which case we need it? And how it is used in your KIP? 2. What does the "partitioner.sticky.batch.size" mean? In the "Configuration" part, you didn't explain it. And

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-02-15 Thread Artem Livshits
Hello, Please add your comments about the KIP. If there are no considerations, I'll put it up for vote in the next few days. -Artem On Mon, Feb 7, 2022 at 6:01 PM Artem Livshits wrote: > Hello, > > After trying a few prototypes, I've made some changes to the public > interface. Please see

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2022-02-07 Thread Artem Livshits
Hello, After trying a few prototypes, I've made some changes to the public interface. Please see the updated document https://cwiki.apache.org/confluence/display/KAFKA/KIP-794%3A+Strictly+Uniform+Sticky+Partitioner . -Artem On Thu, Nov 4, 2021 at 10:37 AM Artem Livshits wrote: > Hello, > >

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2021-11-19 Thread Artem Livshits
Hello, During implementation it turned out that the existing Partitioner.partition method doesn't have enough arguments to accurately estimate record size in bytes (e.g. it doesn't have headers, cannot take compression into account, etc.). So I'm proposing to add a new Partitioner.partition

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2021-11-08 Thread Artem Livshits
Hi Luke, Justine, Thank you for feedback and questions. I've added clarification to the KIP. > there will be some period of time the distribution is not even. That's correct. There would be a small temporary imbalance, but over time the distribution should be uniform. > 1. This paragraph is a

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2021-11-08 Thread Justine Olshan
Hi Artem, Thanks for working on improving the Sticky Partitioner! I had a few questions about this portion: *The batching will continue until either an in-flight batch completes or we hit the N bytes and move to the next partition. This way it takes just 5 records to get to batching mode, not 5

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2021-11-08 Thread Luke Chen
Thanks Artem, It's much better now. I've got your idea. In KIP-480: Sticky Partitioner, we'll change partition (call partitioner) when either 1 of below condition match 1. the batch is full 2. when linger.ms is up But, you are changing the definition, into a "partitioner.sticky.batch.size" size is

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2021-11-05 Thread Artem Livshits
Hi Luke, Thank you for your feedback. I've updated the KIP with your suggestions. 1. Updated with a better example. 2. I removed the reference to ClassicDefaultPartitioner, it was probably confusing. 3. The logic doesn't rely on checking batches, I've updated the proposal to make it more

Re: [DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2021-11-04 Thread Luke Chen
Hi Artem, Thanks for the KIP! And thanks for reminding me to complete KIP-782, soon. :) Back to the KIP, I have some comments: 1. You proposed to have a new config: "partitioner.sticky.batch.size", but I can't see how we're going to use it to make the partitioner better. Please explain more in

[DISCUSS] KIP-794: Strictly Uniform Sticky Partitioner

2021-11-04 Thread Artem Livshits
Hello, This is the discussion thread for https://cwiki.apache.org/confluence/display/KAFKA/KIP-794%3A+Strictly+Uniform+Sticky+Partitioner . The proposal is a bug fix for https://issues.apache.org/jira/browse/KAFKA-10888, but it does include a client config change, therefore we have a KIP to