Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2025-01-06 Thread TaiJu Wu
 > >> producer
> > > >> > > is to
> > > >> > > > >> minimize resource usage on edge devices with limited
> hardware
> > > >> > > capabilities.
> > > >> > > > >> Currently, we use a producer pool to handle different acks
> > > >> values,
> > > >> > > which
> > > >> > > > >> requires 3x producer instances. Additionally, this approach
> > > >> creates
> > > >> > > many
> > > >> > > > >> idle producers if a sensor with a specific acks setting
> has no
> > > >> data
> > > >> > > for a
> > > >> > > > >> while.
> > > >> > > > >>
> > > >> > > > >> I love David’s suggestion since the acks configuration is
> > > closely
> > > >> > > related
> > > >> > > > >> to the topic. Maybe we can introduce an optional
> configuration
> > > >> in the
> > > >> > > > >> producer to define topic-level acks, with the existing acks
> > > >> being the
> > > >> > > > >> default for all topics. This approach is not only simple
> but
> > > also
> > > >> > > easy to
> > > >> > > > >> understand and implement.
> > > >> > > > >>
> > > >> > > > >> Best,
> > > >> > > > >> Chia-Ping
> > > >> > > > >>
> > > >> > > > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> > > >> > > > >> > Hi TaiJuWu,
> > > >> > > > >> > I've been thinking for a while about this KIP before
> jumping
> > > >> into
> > > >> > > the
> > > >> > > > >> discussion.
> > > >> > > > >> >
> > > >> > > > >> > I'm afraid that I don't think the approach in the KIP is
> the
> > > >> best,
> > > >> > > > >> given the design
> > > >> > > > >> > of the Kafka protocol in this area. Essentially, each
> Produce
> > > >> > > request
> > > >> > > > >> contains
> > > >> > > > >> > the acks value at the top level, and may contain records
> for
> > > >> many
> > > >> > > > >> topics or
> > > >> > > > >> > partitions. My point is that batching occurs at the
> level of
> > > a
> > > >> > > Produce
> > > >> > > > >> request,
> > > >> > > > >> > so changing the acks value between records will require
> a new
> > > >> > > Produce
> > > >> > > > >> request
> > > >> > > > >> > to be sent. There would likely be an efficiency penalty
> if
> > > this
> > > >> > > feature
> > > >> > > > >> was used
> > > >> > > > >> > heavily with the acks changing record by record.
> > > >> > > > >> >
> > > >> > > > >> > I can see that potentially an application might want
> > > different
> > > >> ack
> > > >> > > > >> levels for
> > > >> > > > >> > different topics, but I would be surprised if they use
> > > >> different ack
> > > >> > > > >> levels within
> > > >> > > > >> > the same topic. Maybe David's suggestion of defining the
> acks
> > > >> per
> > > >> > > topic
> > > >> > > > >> > would be enough. What do you think?
> > > >> > > > >> >
> > > >> > > > >> > Thanks,
> > > >> > > > >> > Andrew
> > > >> > > > >> > 
> > > >> > > > >> > From: David Jacot 
> > > >> > > > >> > Sent: 13 November 2024 15:31
> > > >> > > > >> > To: dev@kafka.apache.org 
> > > >> > > &

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2025-01-06 Thread Kirk True
gt; > > On 2024/11/15 05:12:33 TaiJu Wu wrote:
> > >> > > > Hi all,
> > >> > > >
> > >> > > > I have updated the contents of this KIP
> > >> > > > Please take a look and let me know what you think.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > TaiJuWu
> > >> > > >
> > >> > > > On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu 
> > >> wrote:
> > >> > > >
> > >> > > > > Hi all,
> > >> > > > >
> > >> > > > > Thanks for your feeback and @Chia-Ping's help.
> > >> > > > > .
> > >> > > > > I also agree topic-level acks config is more reasonable and it
> > can
> > >> > > simply
> > >> > > > > the story.
> > >> > > > > When I try implementing record-level acks, I notice I don't have
> > >> good
> > >> > > idea
> > >> > > > > to avoid iterating batches for get partition information (need
> > by
> > >> > > > > *RecordAccumulator#partitionChanged*).
> > >> > > > >
> > >> > > > > Back to the init question how can I handle different acks for
> > >> batches:
> > >> > > > > First, we can attach *topic-level acks *to
> > >> > > *RecordAccumulator#TopicInfo*.
> > >> > > > > Second,  we can return *Map>* when
> > >> > > *RecordAccumulator#drainBatchesForOneNode
> > >> > > > > *is called. In this step, we can propagate acks to *sender*.
> > >> > > > > Finally, we can get the acks info and group same acks into a
> > >> > > > > *List>* for a node in
> > *sender#sendProduceRequests*.
> > >> > > > >
> > >> > > > > If I missed something or there is any mistake, please let me
> > know.
> > >> > > > > I will update this KIP later, thank your feedback.
> > >> > > > >
> > >> > > > > Best,
> > >> > > > > TaiJuWu
> > >> > > > >
> > >> > > > >
> > >> > > > > Chia-Ping Tsai  於 2024年11月14日 週四 上午9:46寫道:
> > >> > > > >
> > >> > > > >> hi All
> > >> > > > >>
> > >> > > > >> This KIP is based on our use case where an edge application
> > with
> > >> many
> > >> > > > >> sensors wants to use a single producer to deliver ‘few but
> > >> varied’
> > >> > > records
> > >> > > > >> with different acks settings. The reason for using a single
> > >> producer
> > >> > > is to
> > >> > > > >> minimize resource usage on edge devices with limited hardware
> > >> > > capabilities.
> > >> > > > >> Currently, we use a producer pool to handle different acks
> > >> values,
> > >> > > which
> > >> > > > >> requires 3x producer instances. Additionally, this approach
> > >> creates
> > >> > > many
> > >> > > > >> idle producers if a sensor with a specific acks setting has no
> > >> data
> > >> > > for a
> > >> > > > >> while.
> > >> > > > >>
> > >> > > > >> I love David’s suggestion since the acks configuration is
> > closely
> > >> > > related
> > >> > > > >> to the topic. Maybe we can introduce an optional configuration
> > >> in the
> > >> > > > >> producer to define topic-level acks, with the existing acks
> > >> being the
> > >> > > > >> default for all topics. This approach is not only simple but
> > also
> > >> > > easy to
> > >> > > > >> understand and implement.
> > >> > > > >>
> > >> > > > >> Best,
> > >> > > > >> Chia-Ping
> > >> > > > >>
> > >> > > > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> > >> > > > >> > Hi TaiJuWu,
> > >> &g

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2025-01-03 Thread Divij Vaidya
; > When I try implementing record-level acks, I notice I don't have
> >> good
> >> > > idea
> >> > > > > to avoid iterating batches for get partition information (need
> by
> >> > > > > *RecordAccumulator#partitionChanged*).
> >> > > > >
> >> > > > > Back to the init question how can I handle different acks for
> >> batches:
> >> > > > > First, we can attach *topic-level acks *to
> >> > > *RecordAccumulator#TopicInfo*.
> >> > > > > Second,  we can return *Map>* when
> >> > > *RecordAccumulator#drainBatchesForOneNode
> >> > > > > *is called. In this step, we can propagate acks to *sender*.
> >> > > > > Finally, we can get the acks info and group same acks into a
> >> > > > > *List>* for a node in
> *sender#sendProduceRequests*.
> >> > > > >
> >> > > > > If I missed something or there is any mistake, please let me
> know.
> >> > > > > I will update this KIP later, thank your feedback.
> >> > > > >
> >> > > > > Best,
> >> > > > > TaiJuWu
> >> > > > >
> >> > > > >
> >> > > > > Chia-Ping Tsai  於 2024年11月14日 週四 上午9:46寫道:
> >> > > > >
> >> > > > >> hi All
> >> > > > >>
> >> > > > >> This KIP is based on our use case where an edge application
> with
> >> many
> >> > > > >> sensors wants to use a single producer to deliver ‘few but
> >> varied’
> >> > > records
> >> > > > >> with different acks settings. The reason for using a single
> >> producer
> >> > > is to
> >> > > > >> minimize resource usage on edge devices with limited hardware
> >> > > capabilities.
> >> > > > >> Currently, we use a producer pool to handle different acks
> >> values,
> >> > > which
> >> > > > >> requires 3x producer instances. Additionally, this approach
> >> creates
> >> > > many
> >> > > > >> idle producers if a sensor with a specific acks setting has no
> >> data
> >> > > for a
> >> > > > >> while.
> >> > > > >>
> >> > > > >> I love David’s suggestion since the acks configuration is
> closely
> >> > > related
> >> > > > >> to the topic. Maybe we can introduce an optional configuration
> >> in the
> >> > > > >> producer to define topic-level acks, with the existing acks
> >> being the
> >> > > > >> default for all topics. This approach is not only simple but
> also
> >> > > easy to
> >> > > > >> understand and implement.
> >> > > > >>
> >> > > > >> Best,
> >> > > > >> Chia-Ping
> >> > > > >>
> >> > > > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> >> > > > >> > Hi TaiJuWu,
> >> > > > >> > I've been thinking for a while about this KIP before jumping
> >> into
> >> > > the
> >> > > > >> discussion.
> >> > > > >> >
> >> > > > >> > I'm afraid that I don't think the approach in the KIP is the
> >> best,
> >> > > > >> given the design
> >> > > > >> > of the Kafka protocol in this area. Essentially, each Produce
> >> > > request
> >> > > > >> contains
> >> > > > >> > the acks value at the top level, and may contain records for
> >> many
> >> > > > >> topics or
> >> > > > >> > partitions. My point is that batching occurs at the level of
> a
> >> > > Produce
> >> > > > >> request,
> >> > > > >> > so changing the acks value between records will require a new
> >> > > Produce
> >> > > > >> request
> >> > > > >> > to be sent. There would likely be an efficiency penalty if
> this
> >> > > feature
> >> > > > >> was used
> >> > 

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2025-01-02 Thread TaiJu Wu
The reason for using a single
>> producer
>> > > is to
>> > > > >> minimize resource usage on edge devices with limited hardware
>> > > capabilities.
>> > > > >> Currently, we use a producer pool to handle different acks
>> values,
>> > > which
>> > > > >> requires 3x producer instances. Additionally, this approach
>> creates
>> > > many
>> > > > >> idle producers if a sensor with a specific acks setting has no
>> data
>> > > for a
>> > > > >> while.
>> > > > >>
>> > > > >> I love David’s suggestion since the acks configuration is closely
>> > > related
>> > > > >> to the topic. Maybe we can introduce an optional configuration
>> in the
>> > > > >> producer to define topic-level acks, with the existing acks
>> being the
>> > > > >> default for all topics. This approach is not only simple but also
>> > > easy to
>> > > > >> understand and implement.
>> > > > >>
>> > > > >> Best,
>> > > > >> Chia-Ping
>> > > > >>
>> > > > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
>> > > > >> > Hi TaiJuWu,
>> > > > >> > I've been thinking for a while about this KIP before jumping
>> into
>> > > the
>> > > > >> discussion.
>> > > > >> >
>> > > > >> > I'm afraid that I don't think the approach in the KIP is the
>> best,
>> > > > >> given the design
>> > > > >> > of the Kafka protocol in this area. Essentially, each Produce
>> > > request
>> > > > >> contains
>> > > > >> > the acks value at the top level, and may contain records for
>> many
>> > > > >> topics or
>> > > > >> > partitions. My point is that batching occurs at the level of a
>> > > Produce
>> > > > >> request,
>> > > > >> > so changing the acks value between records will require a new
>> > > Produce
>> > > > >> request
>> > > > >> > to be sent. There would likely be an efficiency penalty if this
>> > > feature
>> > > > >> was used
>> > > > >> > heavily with the acks changing record by record.
>> > > > >> >
>> > > > >> > I can see that potentially an application might want different
>> ack
>> > > > >> levels for
>> > > > >> > different topics, but I would be surprised if they use
>> different ack
>> > > > >> levels within
>> > > > >> > the same topic. Maybe David's suggestion of defining the acks
>> per
>> > > topic
>> > > > >> > would be enough. What do you think?
>> > > > >> >
>> > > > >> > Thanks,
>> > > > >> > Andrew
>> > > > >> > 
>> > > > >> > From: David Jacot 
>> > > > >> > Sent: 13 November 2024 15:31
>> > > > >> > To: dev@kafka.apache.org 
>> > > > >> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for
>> > > producers
>> > > > >> >
>> > > > >> > Hi TaiJuWu,
>> > > > >> >
>> > > > >> > Thanks for the KIP.
>> > > > >> >
>> > > > >> > The motivation is not clear to me. Could you please elaborate
>> a bit
>> > > > >> more on
>> > > > >> > it?
>> > > > >> >
>> > > > >> > My concern is that it adds a lot of complexity and the added
>> value
>> > > > >> seems to
>> > > > >> > be low. Moreover, it will make reasoning about an application
>> from
>> > > the
>> > > > >> > server side more difficult because we can no longer assume
>> that it
>> > > > >> writes
>> > > > >> > with the ack based on the config. Another issue is about the
>> > > batching,
>> > > > >&

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-23 Thread TaiJu Wu
cks configuration is closely
> > > related
> > > > >> to the topic. Maybe we can introduce an optional configuration in
> the
> > > > >> producer to define topic-level acks, with the existing acks being
> the
> > > > >> default for all topics. This approach is not only simple but also
> > > easy to
> > > > >> understand and implement.
> > > > >>
> > > > >> Best,
> > > > >> Chia-Ping
> > > > >>
> > > > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> > > > >> > Hi TaiJuWu,
> > > > >> > I've been thinking for a while about this KIP before jumping
> into
> > > the
> > > > >> discussion.
> > > > >> >
> > > > >> > I'm afraid that I don't think the approach in the KIP is the
> best,
> > > > >> given the design
> > > > >> > of the Kafka protocol in this area. Essentially, each Produce
> > > request
> > > > >> contains
> > > > >> > the acks value at the top level, and may contain records for
> many
> > > > >> topics or
> > > > >> > partitions. My point is that batching occurs at the level of a
> > > Produce
> > > > >> request,
> > > > >> > so changing the acks value between records will require a new
> > > Produce
> > > > >> request
> > > > >> > to be sent. There would likely be an efficiency penalty if this
> > > feature
> > > > >> was used
> > > > >> > heavily with the acks changing record by record.
> > > > >> >
> > > > >> > I can see that potentially an application might want different
> ack
> > > > >> levels for
> > > > >> > different topics, but I would be surprised if they use
> different ack
> > > > >> levels within
> > > > >> > the same topic. Maybe David's suggestion of defining the acks
> per
> > > topic
> > > > >> > would be enough. What do you think?
> > > > >> >
> > > > >> > Thanks,
> > > > >> > Andrew
> > > > >> > 
> > > > >> > From: David Jacot 
> > > > >> > Sent: 13 November 2024 15:31
> > > > >> > To: dev@kafka.apache.org 
> > > > >> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for
> > > producers
> > > > >> >
> > > > >> > Hi TaiJuWu,
> > > > >> >
> > > > >> > Thanks for the KIP.
> > > > >> >
> > > > >> > The motivation is not clear to me. Could you please elaborate a
> bit
> > > > >> more on
> > > > >> > it?
> > > > >> >
> > > > >> > My concern is that it adds a lot of complexity and the added
> value
> > > > >> seems to
> > > > >> > be low. Moreover, it will make reasoning about an application
> from
> > > the
> > > > >> > server side more difficult because we can no longer assume that
> it
> > > > >> writes
> > > > >> > with the ack based on the config. Another issue is about the
> > > batching,
> > > > >> how
> > > > >> > do you plan to handle batches mixing records with different
> acks?
> > > > >> >
> > > > >> > An alternative approach may be to define the ack per topic. We
> could
> > > > >> even
> > > > >> > think about defining it on the server side as a topic config. I
> > > haven't
> > > > >> > really thought about it but it may be something to explore a bit
> > > more.
> > > > >> >
> > > > >> > Best,
> > > > >> > David
> > > > >> >
> > > > >> > On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
> > > > >> >  wrote:
> > > > >> >
> > > > >> > > Hi TaiJuWu,
> > > > >> > >
> > > > >> > > I find this adding lot's of complexity and I am still not
> > > convinced
> > > > >> by the

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-20 Thread Chia-Ping Tsai
; >
> > > >> > I'm afraid that I don't think the approach in the KIP is the best,
> > > >> given the design
> > > >> > of the Kafka protocol in this area. Essentially, each Produce
> > request
> > > >> contains
> > > >> > the acks value at the top level, and may contain records for many
> > > >> topics or
> > > >> > partitions. My point is that batching occurs at the level of a
> > Produce
> > > >> request,
> > > >> > so changing the acks value between records will require a new
> > Produce
> > > >> request
> > > >> > to be sent. There would likely be an efficiency penalty if this
> > feature
> > > >> was used
> > > >> > heavily with the acks changing record by record.
> > > >> >
> > > >> > I can see that potentially an application might want different ack
> > > >> levels for
> > > >> > different topics, but I would be surprised if they use different ack
> > > >> levels within
> > > >> > the same topic. Maybe David's suggestion of defining the acks per
> > topic
> > > >> > would be enough. What do you think?
> > > >> >
> > > >> > Thanks,
> > > >> > Andrew
> > > >> > 
> > > >> > From: David Jacot 
> > > >> > Sent: 13 November 2024 15:31
> > > >> > To: dev@kafka.apache.org 
> > > >> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for
> > producers
> > > >> >
> > > >> > Hi TaiJuWu,
> > > >> >
> > > >> > Thanks for the KIP.
> > > >> >
> > > >> > The motivation is not clear to me. Could you please elaborate a bit
> > > >> more on
> > > >> > it?
> > > >> >
> > > >> > My concern is that it adds a lot of complexity and the added value
> > > >> seems to
> > > >> > be low. Moreover, it will make reasoning about an application from
> > the
> > > >> > server side more difficult because we can no longer assume that it
> > > >> writes
> > > >> > with the ack based on the config. Another issue is about the
> > batching,
> > > >> how
> > > >> > do you plan to handle batches mixing records with different acks?
> > > >> >
> > > >> > An alternative approach may be to define the ack per topic. We could
> > > >> even
> > > >> > think about defining it on the server side as a topic config. I
> > haven't
> > > >> > really thought about it but it may be something to explore a bit
> > more.
> > > >> >
> > > >> > Best,
> > > >> > David
> > > >> >
> > > >> > On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
> > > >> >  wrote:
> > > >> >
> > > >> > > Hi TaiJuWu,
> > > >> > >
> > > >> > > I find this adding lot's of complexity and I am still not
> > convinced
> > > >> by the
> > > >> > > added value. IMO creating a producer instance per ack level is not
> > > >> > > problematic and the behavior is clear for developers. What would
> > be
> > > >> the
> > > >> > > added value of the proposed change ?
> > > >> > >
> > > >> > > Regards,
> > > >> > >
> > > >> > >
> > > >> > > On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu 
> > wrote:
> > > >> > >
> > > >> > > > Hi Fred and Greg,
> > > >> > > >
> > > >> > > > Thanks for your feedback and it really not straightforward but
> > > >> > > interesting!
> > > >> > > > There are some behavior I expect.
> > > >> > > >
> > > >> > > > The current producer uses the *RecordAccumulator* to gather
> > > >> records, and
> > > >> > > > the sender thread sends them in batches. We can track each
> > record’s
> > > >> > > > acknowledgment setting as it appends to the *RecordAccumulator*,
> > > >> allowing
&

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-18 Thread TaiJu Wu
Hi Chia-Ping,

Thanks for your suggestions and feedback.

Q1: I have updated this according your suggestions.
Q2: This is necessary change since there is a assumption about
*RecourdAccumulator
*that all records have same acks(e.g. ProducerConfig.acks) so we need to a
method to distinguish which acks belong to each Batch.

Best,
TaiJuWu

Chia-Ping Tsai  於 2024年11月18日 週一 上午2:17寫道:

> hi TaiJuWu
>
> Q0:
>
> `Format: topic.acks`  the dot is acceptable character in topic naming, so
> maybe we should reverse the format to "acks.${topic}" to get the acks of
> topic easily
>
> Q1: `Return Map> when
> RecordAccumulator#drainBatchesForOneNode is called.`
>
> this is weird to me, as all we need to do is pass `Map to
> `Sender` and make sure `Sender#sendProduceRequest` add correct acks to
> ProduceRequest, right?
>
> Best,
> Chia-Ping
>
>
>
> On 2024/11/15 05:12:33 TaiJu Wu wrote:
> > Hi all,
> >
> > I have updated the contents of this KIP
> > Please take a look and let me know what you think.
> >
> > Thanks,
> > TaiJuWu
> >
> > On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu  wrote:
> >
> > > Hi all,
> > >
> > > Thanks for your feeback and @Chia-Ping's help.
> > > .
> > > I also agree topic-level acks config is more reasonable and it can
> simply
> > > the story.
> > > When I try implementing record-level acks, I notice I don't have good
> idea
> > > to avoid iterating batches for get partition information (need by
> > > *RecordAccumulator#partitionChanged*).
> > >
> > > Back to the init question how can I handle different acks for batches:
> > > First, we can attach *topic-level acks *to
> *RecordAccumulator#TopicInfo*.
> > > Second,  we can return *Map>* when
> *RecordAccumulator#drainBatchesForOneNode
> > > *is called. In this step, we can propagate acks to *sender*.
> > > Finally, we can get the acks info and group same acks into a
> > > *List>* for a node in *sender#sendProduceRequests*.
> > >
> > > If I missed something or there is any mistake, please let me know.
> > > I will update this KIP later, thank your feedback.
> > >
> > > Best,
> > > TaiJuWu
> > >
> > >
> > > Chia-Ping Tsai  於 2024年11月14日 週四 上午9:46寫道:
> > >
> > >> hi All
> > >>
> > >> This KIP is based on our use case where an edge application with many
> > >> sensors wants to use a single producer to deliver ‘few but varied’
> records
> > >> with different acks settings. The reason for using a single producer
> is to
> > >> minimize resource usage on edge devices with limited hardware
> capabilities.
> > >> Currently, we use a producer pool to handle different acks values,
> which
> > >> requires 3x producer instances. Additionally, this approach creates
> many
> > >> idle producers if a sensor with a specific acks setting has no data
> for a
> > >> while.
> > >>
> > >> I love David’s suggestion since the acks configuration is closely
> related
> > >> to the topic. Maybe we can introduce an optional configuration in the
> > >> producer to define topic-level acks, with the existing acks being the
> > >> default for all topics. This approach is not only simple but also
> easy to
> > >> understand and implement.
> > >>
> > >> Best,
> > >> Chia-Ping
> > >>
> > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> > >> > Hi TaiJuWu,
> > >> > I've been thinking for a while about this KIP before jumping into
> the
> > >> discussion.
> > >> >
> > >> > I'm afraid that I don't think the approach in the KIP is the best,
> > >> given the design
> > >> > of the Kafka protocol in this area. Essentially, each Produce
> request
> > >> contains
> > >> > the acks value at the top level, and may contain records for many
> > >> topics or
> > >> > partitions. My point is that batching occurs at the level of a
> Produce
> > >> request,
> > >> > so changing the acks value between records will require a new
> Produce
> > >> request
> > >> > to be sent. There would likely be an efficiency penalty if this
> feature
> > >> was used
> > >> > heavily with the acks changing record by record.
> > >> >
> > >> > I can see that potentially 

RE: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-18 Thread Jazmin Gomez
On 2024/11/14 01:45:57 Chia-Ping Tsai wrote:
> hi All
>
> This KIP is based on our use case where an edge application with many
sensors wants to use a single producer to deliver ‘few but varied’ records
with different acks settings. The reason for using a single producer is to
minimize resource usage on edge devices with limited hardware capabilities.
Currently, we use a producer pool to handle different acks values, which
requires 3x producer instances. Additionally, this approach creates many
idle producers if a sensor with a specific acks setting has no data for a
while.
>
> I love David’s suggestion since the acks configuration is closely related
to the topic. Maybe we can introduce an optional configuration in the
producer to define topic-level acks, with the existing acks being the
default for all topics. This approach is not only simple but also easy to
understand and implement.
>
> Best,
> Chia-Ping
>
> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> > Hi TaiJuWu,
> > I've been thinking for a while about this KIP before jumping into the
discussion.
> >
> > I'm afraid that I don't think the approach in the KIP is the best,
given the design
> > of the Kafka protocol in this area. Essentially, each Produce request
contains
> > the acks value at the top level, and may contain records for many
topics or
> > partitions. My point is that batching occurs at the level of a Produce
request,
> > so changing the acks value between records will require a new Produce
request
> > to be sent. There would likely be an efficiency penalty if this feature
was used
> > heavily with the acks changing record by record.
> >
> > I can see that potentially an application might want different ack
levels for
> > different topics, but I would be surprised if they use different ack
levels within
> > the same topic. Maybe David's suggestion of defining the acks per topic
> > would be enough. What do you think?
> >
> > Thanks,
> > Andrew
> > ________________
> > From: David Jacot 
> > Sent: 13 November 2024 15:31
> > To: dev@kafka.apache.org 
> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
> >
> > Hi TaiJuWu,
> >
> > Thanks for the KIP.
> >
> > The motivation is not clear to me. Could you please elaborate a bit
more on
> > it?
> >
> > My concern is that it adds a lot of complexity and the added value
seems to
> > be low. Moreover, it will make reasoning about an application from the
> > server side more difficult because we can no longer assume that it
writes
> > with the ack based on the config. Another issue is about the batching,
how
> > do you plan to handle batches mixing records with different acks?
> >
> > An alternative approach may be to define the ack per topic. We could
even
> > think about defining it on the server side as a topic config. I haven't
> > really thought about it but it may be something to explore a bit more.
> >
> > Best,
> > David
> >
> > On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
> >  wrote:
> >
> > > Hi TaiJuWu,
> > >
> > > I find this adding lot's of complexity and I am still not convinced
by the
> > > added value. IMO creating a producer instance per ack level is not
> > > problematic and the behavior is clear for developers. What would be
the
> > > added value of the proposed change ?
> > >
> > > Regards,
> > >
> > >
> > > On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu  wrote:
> > >
> > > > Hi Fred and Greg,
> > > >
> > > > Thanks for your feedback and it really not straightforward but
> > > interesting!
> > > > There are some behavior I expect.
> > > >
> > > > The current producer uses the *RecordAccumulator* to gather
records, and
> > > > the sender thread sends them in batches. We can track each record’s
> > > > acknowledgment setting as it appends to the *RecordAccumulator*,
allowing
> > > > the *sender *to group batches by acknowledgment levels and
topicPartition
> > > > when processing.
> > > >
> > > > Regarding the statement, "Callbacks for records being sent to the
same
> > > > partition are guaranteed to execute in order," this is ensured when
> > > > *max.inflight.request
> > > > *is set to 1. We can send records with different acknowledgment
levels in
> > > > the order of acks-0, acks=1, acks=-1. Since we need to send batches
with
> > > > different acknowledgment levels batche

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-17 Thread Chia-Ping Tsai
hi TaiJuWu

Q0:

`Format: topic.acks`  the dot is acceptable character in topic naming, so maybe 
we should reverse the format to "acks.${topic}" to get the acks of topic easily

Q1: `Return Map> when 
RecordAccumulator#drainBatchesForOneNode is called.`

this is weird to me, as all we need to do is pass `Map to 
`Sender` and make sure `Sender#sendProduceRequest` add correct acks to 
ProduceRequest, right?

Best,
Chia-Ping



On 2024/11/15 05:12:33 TaiJu Wu wrote:
> Hi all,
> 
> I have updated the contents of this KIP
> Please take a look and let me know what you think.
> 
> Thanks,
> TaiJuWu
> 
> On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu  wrote:
> 
> > Hi all,
> >
> > Thanks for your feeback and @Chia-Ping's help.
> > .
> > I also agree topic-level acks config is more reasonable and it can simply
> > the story.
> > When I try implementing record-level acks, I notice I don't have good idea
> > to avoid iterating batches for get partition information (need by
> > *RecordAccumulator#partitionChanged*).
> >
> > Back to the init question how can I handle different acks for batches:
> > First, we can attach *topic-level acks *to *RecordAccumulator#TopicInfo*.
> > Second,  we can return *Map>* when 
> > *RecordAccumulator#drainBatchesForOneNode
> > *is called. In this step, we can propagate acks to *sender*.
> > Finally, we can get the acks info and group same acks into a
> > *List>* for a node in *sender#sendProduceRequests*.
> >
> > If I missed something or there is any mistake, please let me know.
> > I will update this KIP later, thank your feedback.
> >
> > Best,
> > TaiJuWu
> >
> >
> > Chia-Ping Tsai  於 2024年11月14日 週四 上午9:46寫道:
> >
> >> hi All
> >>
> >> This KIP is based on our use case where an edge application with many
> >> sensors wants to use a single producer to deliver ‘few but varied’ records
> >> with different acks settings. The reason for using a single producer is to
> >> minimize resource usage on edge devices with limited hardware capabilities.
> >> Currently, we use a producer pool to handle different acks values, which
> >> requires 3x producer instances. Additionally, this approach creates many
> >> idle producers if a sensor with a specific acks setting has no data for a
> >> while.
> >>
> >> I love David’s suggestion since the acks configuration is closely related
> >> to the topic. Maybe we can introduce an optional configuration in the
> >> producer to define topic-level acks, with the existing acks being the
> >> default for all topics. This approach is not only simple but also easy to
> >> understand and implement.
> >>
> >> Best,
> >> Chia-Ping
> >>
> >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> >> > Hi TaiJuWu,
> >> > I've been thinking for a while about this KIP before jumping into the
> >> discussion.
> >> >
> >> > I'm afraid that I don't think the approach in the KIP is the best,
> >> given the design
> >> > of the Kafka protocol in this area. Essentially, each Produce request
> >> contains
> >> > the acks value at the top level, and may contain records for many
> >> topics or
> >> > partitions. My point is that batching occurs at the level of a Produce
> >> request,
> >> > so changing the acks value between records will require a new Produce
> >> request
> >> > to be sent. There would likely be an efficiency penalty if this feature
> >> was used
> >> > heavily with the acks changing record by record.
> >> >
> >> > I can see that potentially an application might want different ack
> >> levels for
> >> > different topics, but I would be surprised if they use different ack
> >> levels within
> >> > the same topic. Maybe David's suggestion of defining the acks per topic
> >> > would be enough. What do you think?
> >> >
> >> > Thanks,
> >> > Andrew
> >> > 
> >> > From: David Jacot 
> >> > Sent: 13 November 2024 15:31
> >> > To: dev@kafka.apache.org 
> >> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
> >> >
> >> > Hi TaiJuWu,
> >> >
> >> > Thanks for the KIP.
> >> >
> >> > The motivation is not clear to me. Could you please elaborate a bit
> >> more on
> &g

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-14 Thread TaiJu Wu
Hi all,

I have updated the contents of this KIP
Please take a look and let me know what you think.

Thanks,
TaiJuWu

On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu  wrote:

> Hi all,
>
> Thanks for your feeback and @Chia-Ping's help.
> .
> I also agree topic-level acks config is more reasonable and it can simply
> the story.
> When I try implementing record-level acks, I notice I don't have good idea
> to avoid iterating batches for get partition information (need by
> *RecordAccumulator#partitionChanged*).
>
> Back to the init question how can I handle different acks for batches:
> First, we can attach *topic-level acks *to *RecordAccumulator#TopicInfo*.
> Second,  we can return *Map>* when 
> *RecordAccumulator#drainBatchesForOneNode
> *is called. In this step, we can propagate acks to *sender*.
> Finally, we can get the acks info and group same acks into a
> *List>* for a node in *sender#sendProduceRequests*.
>
> If I missed something or there is any mistake, please let me know.
> I will update this KIP later, thank your feedback.
>
> Best,
> TaiJuWu
>
>
> Chia-Ping Tsai  於 2024年11月14日 週四 上午9:46寫道:
>
>> hi All
>>
>> This KIP is based on our use case where an edge application with many
>> sensors wants to use a single producer to deliver ‘few but varied’ records
>> with different acks settings. The reason for using a single producer is to
>> minimize resource usage on edge devices with limited hardware capabilities.
>> Currently, we use a producer pool to handle different acks values, which
>> requires 3x producer instances. Additionally, this approach creates many
>> idle producers if a sensor with a specific acks setting has no data for a
>> while.
>>
>> I love David’s suggestion since the acks configuration is closely related
>> to the topic. Maybe we can introduce an optional configuration in the
>> producer to define topic-level acks, with the existing acks being the
>> default for all topics. This approach is not only simple but also easy to
>> understand and implement.
>>
>> Best,
>> Chia-Ping
>>
>> On 2024/11/13 16:04:24 Andrew Schofield wrote:
>> > Hi TaiJuWu,
>> > I've been thinking for a while about this KIP before jumping into the
>> discussion.
>> >
>> > I'm afraid that I don't think the approach in the KIP is the best,
>> given the design
>> > of the Kafka protocol in this area. Essentially, each Produce request
>> contains
>> > the acks value at the top level, and may contain records for many
>> topics or
>> > partitions. My point is that batching occurs at the level of a Produce
>> request,
>> > so changing the acks value between records will require a new Produce
>> request
>> > to be sent. There would likely be an efficiency penalty if this feature
>> was used
>> > heavily with the acks changing record by record.
>> >
>> > I can see that potentially an application might want different ack
>> levels for
>> > different topics, but I would be surprised if they use different ack
>> levels within
>> > the same topic. Maybe David's suggestion of defining the acks per topic
>> > would be enough. What do you think?
>> >
>> > Thanks,
>> > Andrew
>> > 
>> > From: David Jacot 
>> > Sent: 13 November 2024 15:31
>> > To: dev@kafka.apache.org 
>> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
>> >
>> > Hi TaiJuWu,
>> >
>> > Thanks for the KIP.
>> >
>> > The motivation is not clear to me. Could you please elaborate a bit
>> more on
>> > it?
>> >
>> > My concern is that it adds a lot of complexity and the added value
>> seems to
>> > be low. Moreover, it will make reasoning about an application from the
>> > server side more difficult because we can no longer assume that it
>> writes
>> > with the ack based on the config. Another issue is about the batching,
>> how
>> > do you plan to handle batches mixing records with different acks?
>> >
>> > An alternative approach may be to define the ack per topic. We could
>> even
>> > think about defining it on the server side as a topic config. I haven't
>> > really thought about it but it may be something to explore a bit more.
>> >
>> > Best,
>> > David
>> >
>> > On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
>> >  wrote:
>> >
>> > > Hi TaiJuW

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-13 Thread TaiJu Wu
Hi all,

Thanks for your feeback and @Chia-Ping's help.
.
I also agree topic-level acks config is more reasonable and it can simply
the story.
When I try implementing record-level acks, I notice I don't have good idea
to avoid iterating batches for get partition information (need by
*RecordAccumulator#partitionChanged*).

Back to the init question how can I handle different acks for batches:
First, we can attach *topic-level acks *to *RecordAccumulator#TopicInfo*.
Second,  we can return *Map>* when
*RecordAccumulator#drainBatchesForOneNode
*is called. In this step, we can propagate acks to *sender*.
Finally, we can get the acks info and group same acks into a
*List>* for a node in *sender#sendProduceRequests*.

If I missed something or there is any mistake, please let me know.
I will update this KIP later, thank your feedback.

Best,
TaiJuWu


Chia-Ping Tsai  於 2024年11月14日 週四 上午9:46寫道:

> hi All
>
> This KIP is based on our use case where an edge application with many
> sensors wants to use a single producer to deliver ‘few but varied’ records
> with different acks settings. The reason for using a single producer is to
> minimize resource usage on edge devices with limited hardware capabilities.
> Currently, we use a producer pool to handle different acks values, which
> requires 3x producer instances. Additionally, this approach creates many
> idle producers if a sensor with a specific acks setting has no data for a
> while.
>
> I love David’s suggestion since the acks configuration is closely related
> to the topic. Maybe we can introduce an optional configuration in the
> producer to define topic-level acks, with the existing acks being the
> default for all topics. This approach is not only simple but also easy to
> understand and implement.
>
> Best,
> Chia-Ping
>
> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> > Hi TaiJuWu,
> > I've been thinking for a while about this KIP before jumping into the
> discussion.
> >
> > I'm afraid that I don't think the approach in the KIP is the best, given
> the design
> > of the Kafka protocol in this area. Essentially, each Produce request
> contains
> > the acks value at the top level, and may contain records for many topics
> or
> > partitions. My point is that batching occurs at the level of a Produce
> request,
> > so changing the acks value between records will require a new Produce
> request
> > to be sent. There would likely be an efficiency penalty if this feature
> was used
> > heavily with the acks changing record by record.
> >
> > I can see that potentially an application might want different ack
> levels for
> > different topics, but I would be surprised if they use different ack
> levels within
> > the same topic. Maybe David's suggestion of defining the acks per topic
> > would be enough. What do you think?
> >
> > Thanks,
> > Andrew
> > 
> > From: David Jacot 
> > Sent: 13 November 2024 15:31
> > To: dev@kafka.apache.org 
> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
> >
> > Hi TaiJuWu,
> >
> > Thanks for the KIP.
> >
> > The motivation is not clear to me. Could you please elaborate a bit more
> on
> > it?
> >
> > My concern is that it adds a lot of complexity and the added value seems
> to
> > be low. Moreover, it will make reasoning about an application from the
> > server side more difficult because we can no longer assume that it writes
> > with the ack based on the config. Another issue is about the batching,
> how
> > do you plan to handle batches mixing records with different acks?
> >
> > An alternative approach may be to define the ack per topic. We could even
> > think about defining it on the server side as a topic config. I haven't
> > really thought about it but it may be something to explore a bit more.
> >
> > Best,
> > David
> >
> > On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
> >  wrote:
> >
> > > Hi TaiJuWu,
> > >
> > > I find this adding lot's of complexity and I am still not convinced by
> the
> > > added value. IMO creating a producer instance per ack level is not
> > > problematic and the behavior is clear for developers. What would be the
> > > added value of the proposed change ?
> > >
> > > Regards,
> > >
> > >
> > > On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu  wrote:
> > >
> > > > Hi Fred and Greg,
> > > >
> > > > Thanks for your feedback and it really not straightforward but
> > &

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-13 Thread Chia-Ping Tsai
hi All

This KIP is based on our use case where an edge application with many sensors 
wants to use a single producer to deliver ‘few but varied’ records with 
different acks settings. The reason for using a single producer is to minimize 
resource usage on edge devices with limited hardware capabilities. Currently, 
we use a producer pool to handle different acks values, which requires 3x 
producer instances. Additionally, this approach creates many idle producers if 
a sensor with a specific acks setting has no data for a while.

I love David’s suggestion since the acks configuration is closely related to 
the topic. Maybe we can introduce an optional configuration in the producer to 
define topic-level acks, with the existing acks being the default for all 
topics. This approach is not only simple but also easy to understand and 
implement.

Best,
Chia-Ping

On 2024/11/13 16:04:24 Andrew Schofield wrote:
> Hi TaiJuWu,
> I've been thinking for a while about this KIP before jumping into the 
> discussion.
> 
> I'm afraid that I don't think the approach in the KIP is the best, given the 
> design
> of the Kafka protocol in this area. Essentially, each Produce request contains
> the acks value at the top level, and may contain records for many topics or
> partitions. My point is that batching occurs at the level of a Produce 
> request,
> so changing the acks value between records will require a new Produce request
> to be sent. There would likely be an efficiency penalty if this feature was 
> used
> heavily with the acks changing record by record.
> 
> I can see that potentially an application might want different ack levels for
> different topics, but I would be surprised if they use different ack levels 
> within
> the same topic. Maybe David's suggestion of defining the acks per topic
> would be enough. What do you think?
> 
> Thanks,
> Andrew
> 
> From: David Jacot 
> Sent: 13 November 2024 15:31
> To: dev@kafka.apache.org 
> Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
> 
> Hi TaiJuWu,
> 
> Thanks for the KIP.
> 
> The motivation is not clear to me. Could you please elaborate a bit more on
> it?
> 
> My concern is that it adds a lot of complexity and the added value seems to
> be low. Moreover, it will make reasoning about an application from the
> server side more difficult because we can no longer assume that it writes
> with the ack based on the config. Another issue is about the batching, how
> do you plan to handle batches mixing records with different acks?
> 
> An alternative approach may be to define the ack per topic. We could even
> think about defining it on the server side as a topic config. I haven't
> really thought about it but it may be something to explore a bit more.
> 
> Best,
> David
> 
> On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
>  wrote:
> 
> > Hi TaiJuWu,
> >
> > I find this adding lot's of complexity and I am still not convinced by the
> > added value. IMO creating a producer instance per ack level is not
> > problematic and the behavior is clear for developers. What would be the
> > added value of the proposed change ?
> >
> > Regards,
> >
> >
> > On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu  wrote:
> >
> > > Hi Fred and Greg,
> > >
> > > Thanks for your feedback and it really not straightforward but
> > interesting!
> > > There are some behavior I expect.
> > >
> > > The current producer uses the *RecordAccumulator* to gather records, and
> > > the sender thread sends them in batches. We can track each record’s
> > > acknowledgment setting as it appends to the *RecordAccumulator*, allowing
> > > the *sender *to group batches by acknowledgment levels and topicPartition
> > > when processing.
> > >
> > > Regarding the statement, "Callbacks for records being sent to the same
> > > partition are guaranteed to execute in order," this is ensured when
> > > *max.inflight.request
> > > *is set to 1. We can send records with different acknowledgment levels in
> > > the order of acks-0, acks=1, acks=-1. Since we need to send batches with
> > > different acknowledgment levels batches to the broker, the callback will
> > > execute after each request is completed.
> > >
> > > In response to, "If so, are low-acks records subject to head-of-line
> > > blocking from high-acks records?," I believe an additional configuration
> > is
> > > necessary to control this behavior. We could allow records to be either
> > > sync or async, tho

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-13 Thread Andrew Schofield
Hi TaiJuWu,
I've been thinking for a while about this KIP before jumping into the 
discussion.

I'm afraid that I don't think the approach in the KIP is the best, given the 
design
of the Kafka protocol in this area. Essentially, each Produce request contains
the acks value at the top level, and may contain records for many topics or
partitions. My point is that batching occurs at the level of a Produce request,
so changing the acks value between records will require a new Produce request
to be sent. There would likely be an efficiency penalty if this feature was used
heavily with the acks changing record by record.

I can see that potentially an application might want different ack levels for
different topics, but I would be surprised if they use different ack levels 
within
the same topic. Maybe David's suggestion of defining the acks per topic
would be enough. What do you think?

Thanks,
Andrew

From: David Jacot 
Sent: 13 November 2024 15:31
To: dev@kafka.apache.org 
Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

Hi TaiJuWu,

Thanks for the KIP.

The motivation is not clear to me. Could you please elaborate a bit more on
it?

My concern is that it adds a lot of complexity and the added value seems to
be low. Moreover, it will make reasoning about an application from the
server side more difficult because we can no longer assume that it writes
with the ack based on the config. Another issue is about the batching, how
do you plan to handle batches mixing records with different acks?

An alternative approach may be to define the ack per topic. We could even
think about defining it on the server side as a topic config. I haven't
really thought about it but it may be something to explore a bit more.

Best,
David

On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
 wrote:

> Hi TaiJuWu,
>
> I find this adding lot's of complexity and I am still not convinced by the
> added value. IMO creating a producer instance per ack level is not
> problematic and the behavior is clear for developers. What would be the
> added value of the proposed change ?
>
> Regards,
>
>
> On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu  wrote:
>
> > Hi Fred and Greg,
> >
> > Thanks for your feedback and it really not straightforward but
> interesting!
> > There are some behavior I expect.
> >
> > The current producer uses the *RecordAccumulator* to gather records, and
> > the sender thread sends them in batches. We can track each record’s
> > acknowledgment setting as it appends to the *RecordAccumulator*, allowing
> > the *sender *to group batches by acknowledgment levels and topicPartition
> > when processing.
> >
> > Regarding the statement, "Callbacks for records being sent to the same
> > partition are guaranteed to execute in order," this is ensured when
> > *max.inflight.request
> > *is set to 1. We can send records with different acknowledgment levels in
> > the order of acks-0, acks=1, acks=-1. Since we need to send batches with
> > different acknowledgment levels batches to the broker, the callback will
> > execute after each request is completed.
> >
> > In response to, "If so, are low-acks records subject to head-of-line
> > blocking from high-acks records?," I believe an additional configuration
> is
> > necessary to control this behavior. We could allow records to be either
> > sync or async, though the callback would still execute after each batch
> > with varying acknowledgment levels completes. To measure behavior across
> > acknowledgment levels, we could also include acks in
> *ProducerIntercepor*.
> >
> > Furthermore, before this KIP, a producer could only include one acks
> level
> > so sequence is premised. However, with this change, we can *ONLY*
> guarantee
> > the sequence within records of the same acknowledgment level because we
> may
> > send up to three separate requests to brokers.
> > Best,
> > TaiJuWu
> >
> >
> > TaiJu Wu  於 2024年11月6日 週三 上午10:01寫道:
> >
> > > Hi  Fred and Greg,
> > >
> > > Apologies for the delayed response.
> > > Yes, you’re correct.
> > > I’ll outline the behavior I expect.
> > >
> > > Thanks for your feedback!
> > >
> > > Best,
> > > TaiJuWu
> > >
> > >
> > > Greg Harris  於 2024年11月6日 週三 上午9:48寫道:
> > >
> > >> Hi TaiJuWu,
> > >>
> > >> Thanks for the KIP!
> > >>
> > >> Can you explain in the KIP about the behavior when the number of acks
> is
> > >> different for individual records? I think the current description
&

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-13 Thread David Jacot
Hi TaiJuWu,

Thanks for the KIP.

The motivation is not clear to me. Could you please elaborate a bit more on
it?

My concern is that it adds a lot of complexity and the added value seems to
be low. Moreover, it will make reasoning about an application from the
server side more difficult because we can no longer assume that it writes
with the ack based on the config. Another issue is about the batching, how
do you plan to handle batches mixing records with different acks?

An alternative approach may be to define the ack per topic. We could even
think about defining it on the server side as a topic config. I haven't
really thought about it but it may be something to explore a bit more.

Best,
David

On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
 wrote:

> Hi TaiJuWu,
>
> I find this adding lot's of complexity and I am still not convinced by the
> added value. IMO creating a producer instance per ack level is not
> problematic and the behavior is clear for developers. What would be the
> added value of the proposed change ?
>
> Regards,
>
>
> On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu  wrote:
>
> > Hi Fred and Greg,
> >
> > Thanks for your feedback and it really not straightforward but
> interesting!
> > There are some behavior I expect.
> >
> > The current producer uses the *RecordAccumulator* to gather records, and
> > the sender thread sends them in batches. We can track each record’s
> > acknowledgment setting as it appends to the *RecordAccumulator*, allowing
> > the *sender *to group batches by acknowledgment levels and topicPartition
> > when processing.
> >
> > Regarding the statement, "Callbacks for records being sent to the same
> > partition are guaranteed to execute in order," this is ensured when
> > *max.inflight.request
> > *is set to 1. We can send records with different acknowledgment levels in
> > the order of acks-0, acks=1, acks=-1. Since we need to send batches with
> > different acknowledgment levels batches to the broker, the callback will
> > execute after each request is completed.
> >
> > In response to, "If so, are low-acks records subject to head-of-line
> > blocking from high-acks records?," I believe an additional configuration
> is
> > necessary to control this behavior. We could allow records to be either
> > sync or async, though the callback would still execute after each batch
> > with varying acknowledgment levels completes. To measure behavior across
> > acknowledgment levels, we could also include acks in
> *ProducerIntercepor*.
> >
> > Furthermore, before this KIP, a producer could only include one acks
> level
> > so sequence is premised. However, with this change, we can *ONLY*
> guarantee
> > the sequence within records of the same acknowledgment level because we
> may
> > send up to three separate requests to brokers.
> > Best,
> > TaiJuWu
> >
> >
> > TaiJu Wu  於 2024年11月6日 週三 上午10:01寫道:
> >
> > > Hi  Fred and Greg,
> > >
> > > Apologies for the delayed response.
> > > Yes, you’re correct.
> > > I’ll outline the behavior I expect.
> > >
> > > Thanks for your feedback!
> > >
> > > Best,
> > > TaiJuWu
> > >
> > >
> > > Greg Harris  於 2024年11月6日 週三 上午9:48寫道:
> > >
> > >> Hi TaiJuWu,
> > >>
> > >> Thanks for the KIP!
> > >>
> > >> Can you explain in the KIP about the behavior when the number of acks
> is
> > >> different for individual records? I think the current description
> using
> > >> the
> > >> word "straightforward" does little to explain that, and may actually
> be
> > >> hiding some complexity.
> > >>
> > >> For example, the send() javadoc contains this: "Callbacks for records
> > >> being
> > >> sent to the same partition are guaranteed to execute in order." Is
> this
> > >> still true when acks vary for records within the same partition?
> > >> If so, are low-acks records subject to head-of-line-blocking from
> > >> high-acks
> > >> records? It seems to me that this feature is useful when acks is
> > specified
> > >> per-topic, but introduces a lot of edge cases that are underspecified.
> > >>
> > >> Thanks,
> > >> Greg
> > >>
> > >>
> > >> On Tue, Nov 5, 2024 at 4:52 PM TaiJu Wu  wrote:
> > >>
> > >> > Hi Chia-Ping,
> > >> >
> > >> > Thanks for your feedback.
> > >> > I have updated KIP based on your suggestions.
> > >> >
> > >> > Best,
> > >> > Stanley
> > >> >
> > >> > Chia-Ping Tsai  於 2024年11月5日 週二 下午4:41寫道:
> > >> >
> > >> > > hi TaiJuWu,
> > >> > >
> > >> > > Q0: Could you please add getter (Short acks()) to "public
> interface"
> > >> > > section?
> > >> > >
> > >> > > Q1: Could you please add RPC json reference to prove "been
> available
> > >> at
> > >> > > the RPC-level,"
> > >> > >
> > >> > > Q2: Could you please add link to producer docs to prove "share a
> > >> single
> > >> > > producer instance across multiple threads"
> > >> > >
> > >> > > Thanks,
> > >> > > Chia-Ping
> > >> > >
> > >> > > On 2024/11/05 01:28:36 吳岱儒 wrote:
> > >> > > > Hi all,
> > >> > > >
> > >> > > > I open a KIP-1107: Adding record-level acks for producers
> > >> > > > <

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-13 Thread Frédérik Rouleau
Hi TaiJuWu,

I find this adding lot's of complexity and I am still not convinced by the
added value. IMO creating a producer instance per ack level is not
problematic and the behavior is clear for developers. What would be the
added value of the proposed change ?

Regards,


On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu  wrote:

> Hi Fred and Greg,
>
> Thanks for your feedback and it really not straightforward but interesting!
> There are some behavior I expect.
>
> The current producer uses the *RecordAccumulator* to gather records, and
> the sender thread sends them in batches. We can track each record’s
> acknowledgment setting as it appends to the *RecordAccumulator*, allowing
> the *sender *to group batches by acknowledgment levels and topicPartition
> when processing.
>
> Regarding the statement, "Callbacks for records being sent to the same
> partition are guaranteed to execute in order," this is ensured when
> *max.inflight.request
> *is set to 1. We can send records with different acknowledgment levels in
> the order of acks-0, acks=1, acks=-1. Since we need to send batches with
> different acknowledgment levels batches to the broker, the callback will
> execute after each request is completed.
>
> In response to, "If so, are low-acks records subject to head-of-line
> blocking from high-acks records?," I believe an additional configuration is
> necessary to control this behavior. We could allow records to be either
> sync or async, though the callback would still execute after each batch
> with varying acknowledgment levels completes. To measure behavior across
> acknowledgment levels, we could also include acks in *ProducerIntercepor*.
>
> Furthermore, before this KIP, a producer could only include one acks level
> so sequence is premised. However, with this change, we can *ONLY* guarantee
> the sequence within records of the same acknowledgment level because we may
> send up to three separate requests to brokers.
> Best,
> TaiJuWu
>
>
> TaiJu Wu  於 2024年11月6日 週三 上午10:01寫道:
>
> > Hi  Fred and Greg,
> >
> > Apologies for the delayed response.
> > Yes, you’re correct.
> > I’ll outline the behavior I expect.
> >
> > Thanks for your feedback!
> >
> > Best,
> > TaiJuWu
> >
> >
> > Greg Harris  於 2024年11月6日 週三 上午9:48寫道:
> >
> >> Hi TaiJuWu,
> >>
> >> Thanks for the KIP!
> >>
> >> Can you explain in the KIP about the behavior when the number of acks is
> >> different for individual records? I think the current description using
> >> the
> >> word "straightforward" does little to explain that, and may actually be
> >> hiding some complexity.
> >>
> >> For example, the send() javadoc contains this: "Callbacks for records
> >> being
> >> sent to the same partition are guaranteed to execute in order." Is this
> >> still true when acks vary for records within the same partition?
> >> If so, are low-acks records subject to head-of-line-blocking from
> >> high-acks
> >> records? It seems to me that this feature is useful when acks is
> specified
> >> per-topic, but introduces a lot of edge cases that are underspecified.
> >>
> >> Thanks,
> >> Greg
> >>
> >>
> >> On Tue, Nov 5, 2024 at 4:52 PM TaiJu Wu  wrote:
> >>
> >> > Hi Chia-Ping,
> >> >
> >> > Thanks for your feedback.
> >> > I have updated KIP based on your suggestions.
> >> >
> >> > Best,
> >> > Stanley
> >> >
> >> > Chia-Ping Tsai  於 2024年11月5日 週二 下午4:41寫道:
> >> >
> >> > > hi TaiJuWu,
> >> > >
> >> > > Q0: Could you please add getter (Short acks()) to "public interface"
> >> > > section?
> >> > >
> >> > > Q1: Could you please add RPC json reference to prove "been available
> >> at
> >> > > the RPC-level,"
> >> > >
> >> > > Q2: Could you please add link to producer docs to prove "share a
> >> single
> >> > > producer instance across multiple threads"
> >> > >
> >> > > Thanks,
> >> > > Chia-Ping
> >> > >
> >> > > On 2024/11/05 01:28:36 吳岱儒 wrote:
> >> > > > Hi all,
> >> > > >
> >> > > > I open a KIP-1107: Adding record-level acks for producers
> >> > > > <
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
> >> > > >
> >> > > > to
> >> > > > reduce the limitation associated with reusing KafkaProducer.
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
> >> > > >
> >> > > > Feedbacks and suggestions are welcome.
> >> > > >
> >> > > > Thanks,
> >> > > > TaiJuWu
> >> > > >
> >> > >
> >> >
> >>
> >
>


Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-05 Thread TaiJu Wu
Hi Fred and Greg,

Thanks for your feedback and it really not straightforward but interesting!
There are some behavior I expect.

The current producer uses the *RecordAccumulator* to gather records, and
the sender thread sends them in batches. We can track each record’s
acknowledgment setting as it appends to the *RecordAccumulator*, allowing
the *sender *to group batches by acknowledgment levels and topicPartition
when processing.

Regarding the statement, "Callbacks for records being sent to the same
partition are guaranteed to execute in order," this is ensured when
*max.inflight.request
*is set to 1. We can send records with different acknowledgment levels in
the order of acks-0, acks=1, acks=-1. Since we need to send batches with
different acknowledgment levels batches to the broker, the callback will
execute after each request is completed.

In response to, "If so, are low-acks records subject to head-of-line
blocking from high-acks records?," I believe an additional configuration is
necessary to control this behavior. We could allow records to be either
sync or async, though the callback would still execute after each batch
with varying acknowledgment levels completes. To measure behavior across
acknowledgment levels, we could also include acks in *ProducerIntercepor*.

Furthermore, before this KIP, a producer could only include one acks level
so sequence is premised. However, with this change, we can *ONLY* guarantee
the sequence within records of the same acknowledgment level because we may
send up to three separate requests to brokers.
Best,
TaiJuWu


TaiJu Wu  於 2024年11月6日 週三 上午10:01寫道:

> Hi  Fred and Greg,
>
> Apologies for the delayed response.
> Yes, you’re correct.
> I’ll outline the behavior I expect.
>
> Thanks for your feedback!
>
> Best,
> TaiJuWu
>
>
> Greg Harris  於 2024年11月6日 週三 上午9:48寫道:
>
>> Hi TaiJuWu,
>>
>> Thanks for the KIP!
>>
>> Can you explain in the KIP about the behavior when the number of acks is
>> different for individual records? I think the current description using
>> the
>> word "straightforward" does little to explain that, and may actually be
>> hiding some complexity.
>>
>> For example, the send() javadoc contains this: "Callbacks for records
>> being
>> sent to the same partition are guaranteed to execute in order." Is this
>> still true when acks vary for records within the same partition?
>> If so, are low-acks records subject to head-of-line-blocking from
>> high-acks
>> records? It seems to me that this feature is useful when acks is specified
>> per-topic, but introduces a lot of edge cases that are underspecified.
>>
>> Thanks,
>> Greg
>>
>>
>> On Tue, Nov 5, 2024 at 4:52 PM TaiJu Wu  wrote:
>>
>> > Hi Chia-Ping,
>> >
>> > Thanks for your feedback.
>> > I have updated KIP based on your suggestions.
>> >
>> > Best,
>> > Stanley
>> >
>> > Chia-Ping Tsai  於 2024年11月5日 週二 下午4:41寫道:
>> >
>> > > hi TaiJuWu,
>> > >
>> > > Q0: Could you please add getter (Short acks()) to "public interface"
>> > > section?
>> > >
>> > > Q1: Could you please add RPC json reference to prove "been available
>> at
>> > > the RPC-level,"
>> > >
>> > > Q2: Could you please add link to producer docs to prove "share a
>> single
>> > > producer instance across multiple threads"
>> > >
>> > > Thanks,
>> > > Chia-Ping
>> > >
>> > > On 2024/11/05 01:28:36 吳岱儒 wrote:
>> > > > Hi all,
>> > > >
>> > > > I open a KIP-1107: Adding record-level acks for producers
>> > > > <
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
>> > > >
>> > > > to
>> > > > reduce the limitation associated with reusing KafkaProducer.
>> > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
>> > > >
>> > > > Feedbacks and suggestions are welcome.
>> > > >
>> > > > Thanks,
>> > > > TaiJuWu
>> > > >
>> > >
>> >
>>
>


Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-05 Thread TaiJu Wu
Hi  Fred and Greg,

Apologies for the delayed response.
Yes, you’re correct.
I’ll outline the behavior I expect.

Thanks for your feedback!

Best,
TaiJuWu


Greg Harris  於 2024年11月6日 週三 上午9:48寫道:

> Hi TaiJuWu,
>
> Thanks for the KIP!
>
> Can you explain in the KIP about the behavior when the number of acks is
> different for individual records? I think the current description using the
> word "straightforward" does little to explain that, and may actually be
> hiding some complexity.
>
> For example, the send() javadoc contains this: "Callbacks for records being
> sent to the same partition are guaranteed to execute in order." Is this
> still true when acks vary for records within the same partition?
> If so, are low-acks records subject to head-of-line-blocking from high-acks
> records? It seems to me that this feature is useful when acks is specified
> per-topic, but introduces a lot of edge cases that are underspecified.
>
> Thanks,
> Greg
>
>
> On Tue, Nov 5, 2024 at 4:52 PM TaiJu Wu  wrote:
>
> > Hi Chia-Ping,
> >
> > Thanks for your feedback.
> > I have updated KIP based on your suggestions.
> >
> > Best,
> > Stanley
> >
> > Chia-Ping Tsai  於 2024年11月5日 週二 下午4:41寫道:
> >
> > > hi TaiJuWu,
> > >
> > > Q0: Could you please add getter (Short acks()) to "public interface"
> > > section?
> > >
> > > Q1: Could you please add RPC json reference to prove "been available at
> > > the RPC-level,"
> > >
> > > Q2: Could you please add link to producer docs to prove "share a single
> > > producer instance across multiple threads"
> > >
> > > Thanks,
> > > Chia-Ping
> > >
> > > On 2024/11/05 01:28:36 吳岱儒 wrote:
> > > > Hi all,
> > > >
> > > > I open a KIP-1107: Adding record-level acks for producers
> > > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
> > > >
> > > > to
> > > > reduce the limitation associated with reusing KafkaProducer.
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
> > > >
> > > > Feedbacks and suggestions are welcome.
> > > >
> > > > Thanks,
> > > > TaiJuWu
> > > >
> > >
> >
>


Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-05 Thread Greg Harris
Hi TaiJuWu,

Thanks for the KIP!

Can you explain in the KIP about the behavior when the number of acks is
different for individual records? I think the current description using the
word "straightforward" does little to explain that, and may actually be
hiding some complexity.

For example, the send() javadoc contains this: "Callbacks for records being
sent to the same partition are guaranteed to execute in order." Is this
still true when acks vary for records within the same partition?
If so, are low-acks records subject to head-of-line-blocking from high-acks
records? It seems to me that this feature is useful when acks is specified
per-topic, but introduces a lot of edge cases that are underspecified.

Thanks,
Greg


On Tue, Nov 5, 2024 at 4:52 PM TaiJu Wu  wrote:

> Hi Chia-Ping,
>
> Thanks for your feedback.
> I have updated KIP based on your suggestions.
>
> Best,
> Stanley
>
> Chia-Ping Tsai  於 2024年11月5日 週二 下午4:41寫道:
>
> > hi TaiJuWu,
> >
> > Q0: Could you please add getter (Short acks()) to "public interface"
> > section?
> >
> > Q1: Could you please add RPC json reference to prove "been available at
> > the RPC-level,"
> >
> > Q2: Could you please add link to producer docs to prove "share a single
> > producer instance across multiple threads"
> >
> > Thanks,
> > Chia-Ping
> >
> > On 2024/11/05 01:28:36 吳岱儒 wrote:
> > > Hi all,
> > >
> > > I open a KIP-1107: Adding record-level acks for producers
> > > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
> > >
> > > to
> > > reduce the limitation associated with reusing KafkaProducer.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
> > >
> > > Feedbacks and suggestions are welcome.
> > >
> > > Thanks,
> > > TaiJuWu
> > >
> >
>


Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-05 Thread TaiJu Wu
Hi Chia-Ping,

Thanks for your feedback.
I have updated KIP based on your suggestions.

Best,
Stanley

Chia-Ping Tsai  於 2024年11月5日 週二 下午4:41寫道:

> hi TaiJuWu,
>
> Q0: Could you please add getter (Short acks()) to "public interface"
> section?
>
> Q1: Could you please add RPC json reference to prove "been available at
> the RPC-level,"
>
> Q2: Could you please add link to producer docs to prove "share a single
> producer instance across multiple threads"
>
> Thanks,
> Chia-Ping
>
> On 2024/11/05 01:28:36 吳岱儒 wrote:
> > Hi all,
> >
> > I open a KIP-1107: Adding record-level acks for producers
> > <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
> >
> > to
> > reduce the limitation associated with reusing KafkaProducer.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
> >
> > Feedbacks and suggestions are welcome.
> >
> > Thanks,
> > TaiJuWu
> >
>


Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-05 Thread Frédérik Rouleau
Sorry, please read:
> LibRdKafka supports acks at the topic level, but then it does *NOT* group
ProduceRequest for different topics in a single request.


Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-05 Thread Frédérik Rouleau
Hi TaiJuWu,

What would be the behavior if the application sends several records with
different ack levels? The RPC support acks at the ProduceRequest, so common
for several batches, might even be for different topic-partitions:
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-ProduceAPI

LibRdKafka supports acks at the topic level, but then it does group
ProduceRequest for different topics in a single request. That might lead to
performance drawbacks in some cases (like many different topics used by a
producer).


Regards,
Fred


Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-05 Thread Chia-Ping Tsai
hi TaiJuWu,

Q0: Could you please add getter (Short acks()) to "public interface" section?

Q1: Could you please add RPC json reference to prove "been available at the 
RPC-level,"

Q2: Could you please add link to producer docs to prove "share a single 
producer instance across multiple threads"

Thanks,
Chia-Ping

On 2024/11/05 01:28:36 吳岱儒 wrote:
> Hi all,
> 
> I open a KIP-1107: Adding record-level acks for producers
> 
> to
> reduce the limitation associated with reusing KafkaProducer.
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers
> 
> Feedbacks and suggestions are welcome.
> 
> Thanks,
> TaiJuWu
> 


[DISCUSS]KIP-1107: Adding record-level acks for producers

2024-11-04 Thread 吳岱儒
Hi all,

I open a KIP-1107: Adding record-level acks for producers

to
reduce the limitation associated with reusing KafkaProducer.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers

Feedbacks and suggestions are welcome.

Thanks,
TaiJuWu