Thanks Neeraj,

>From reading code, I am reasonably certain that no data loss occurred - the
producer reset the batch sequence numbers, and then tried again.

I refer you to this comment in the code of the producer's Sender:

                // tell the user the result of their request. We only
adjust sequence numbers if the batch didn't exhaust
                // its retries -- if it did, we don't know whether the
sequence number was accepted or not, and
                // thus it is not safe to reassign the sequence.

So yeah, the fact that you see it tweaking sequence numbers is because the
batch hasn't exhausted its retry limit, so the sequence number is adjusted,
and then sending the batch is tried again.

That said, if you want to be sure, and can replicate this issue, you could
perhaps modify the code and use the producer send method overload that
accepts a callback which advises of success or failure when a batch is
completed successfully or fatally, and then log any exceptions occurring
there.

Disclaimer - I'm unfamiliar with these internals of producers, so my
reading of the code could be wrong, so I strongly welcome correction from
other mailing list members if that's the case.

Cheers,

Liam Clarke


On Mon, 4 Apr 2022 at 16:29, Neeraj Vaidya
<neeraj.vai...@yahoo.co.in.invalid> wrote:

>  Hi Liam,
> Brokers are on Apache Kafka v2.7.0
> However, the Producer client is using the v2.6 libraries.
>
> Regards,
> Neeraj     On Monday, 4 April, 2022, 02:17:42 pm GMT+10, Liam
> Clarke-Hutchinson <lclar...@redhat.com> wrote:
>
>  Hi Neeraj,
>
> Not sure just yet, I'm diving into the code to find out. Oh, what version
> Kafka are you running please?
>
> Cheers,
>
> Liam
>
> On Mon, 4 Apr 2022 at 14:50, Neeraj Vaidya
> <neeraj.vai...@yahoo.co.in.invalid> wrote:
>
> >  Hi Liam,
> > Thanks for getting back.
> >
> > 1) Producer settings ( I am guessing these are the ones you are
> interested
> > in)
> > enable.idempotence=true
> > max.in.flight.requests.per.connection=5
> >
> > 2) Sample broker logs corresponding to the timestamp in the application
> > logs of the Producer
> >
> > [2022-04-03 15:56:39,587] ERROR [ReplicaManager broker=5] Error
> processing
> > append operation on partition input-topic-114
> (kafka.server.ReplicaManager)
> > org.apache.kafka.common.errors.OutOfOrderSequenceException: Invalid
> > sequence number for new epoch at offset 967756 in partition
> > input-topic-114: 158 (request epoch), 3 (seq. number)
> >
> > Do the producer errors indicate that these messages never made it to the
> > Kafka topic at all ?
> >
> > Regards,
> > Neeraj
> >      On Monday, 4 April, 2022, 12:23:30 pm GMT+10, Liam
> Clarke-Hutchinson <
> > lclar...@redhat.com> wrote:
> >
> >  Hi Neeraj,
> >
> > First off, what are your producer settings?
> > Secondly, do you have brokers logs for the leaders of some of your
> affected
> > topics on hand at all?
> >
> > Cheers,
> >
> > Liam Clarke-Hutchinson
> >
> > On Mon, 4 Apr 2022 at 14:04, Neeraj Vaidya
> > <neeraj.vai...@yahoo.co.in.invalid> wrote:
> >
> > > Hi All,
> > > For one of the Kafka producers that I have, I see that the Producer
> > Record
> > > Error rate is non-zero i.e. out of the expected 3000 messages per
> second
> > > which I a expect to be producing to the topic, I can see that this
> metric
> > > shows a rate of about 200.
> > > Does this indicate that the records failed to be sent to the Kafka
> topic,
> > > or does this metric show up even for each retry in the Producer.Send
> > > operation ?
> > >
> > > Notes :
> > > 1) I have distributed  8 brokers equally across 2 sites. Using
> > > rack-awareness, I am making Kafka position replicas equally across both
> > > sites. My min.isr=2 and replication factor = 4. This makes 2 replicas
> to
> > be
> > > located in each site.
> > > 2) The scenario I am testing is that of shutting down a set of 4
> brokers
> > > in one site (out of 8) for an extended period of time and then bringing
> > > them back up after say 2 hours. This causes the the follower replicas
> on
> > > those brokers to try and catch-up with the leader replicas on the other
> > > brokers. The error rate that I am referring to shows up under this
> > scenario
> > > of restarting the brokers. It does not show up when I have just the
> other
> > > set of (4) brokers.
> > >
> > > To be specific, here are the errors that I see in the Kafka producer
> log
> > > file:
> > >
> > > 2022-04-03 15:56:39.613  WARN --- [-thread | producer-1]
> > > o.a.k.c.p.i.Sender                      : [Producer
> clientId=producer-1]
> > > Got error produce response with correlation id 16512434 on
> > topic-partition
> > > input-topic-114, retrying (2147483646 attempts left). Error:
> > > OUT_OF_ORDER_SEQUENCE_NUMBER
> > > 2022-04-03 15:56:39.613  WARN --- [-thread | producer-1]
> > > o.a.k.c.p.i.Sender                      : [Producer
> clientId=producer-1]
> > > Got error produce response with correlation id 16512434 on
> > topic-partition
> > > input-topic-58, retrying (2147483646 attempts left). Error:
> > > OUT_OF_ORDER_SEQUENCE_NUMBER
> > > 2022-04-03 15:56:39.613  INFO --- [-thread | producer-1]
> > > o.a.k.c.p.i.TransactionManager          : [Producer
> clientId=producer-1]
> > > ProducerId set to 2040 with epoch 159
> > > 2022-04-03 15:56:39.613  INFO --- [-thread | producer-1]
> > > o.a.k.c.p.i.ProducerBatch                : Resetting sequence number of
> > > batch with current sequence 3 for partition input-topic-114 to 0
> > > 2022-04-03 15:56:39.613  INFO --- [-thread | producer-1]
> > > o.a.k.c.p.i.ProducerBatch                : Resetting sequence number of
> > > batch with current sequence 5 for partition input-topic-114 to 2
> > > 2022-04-03 15:56:39.613  INFO --- [-thread | producer-1]
> > > o.a.k.c.p.i.ProducerBatch                : Resetting sequence number of
> > > batch with current sequence 6 for partition input-topic-114 to 3
> > > 2022-04-03 15:56:39.613  INFO --- [-thread | producer-1]
> > > o.a.k.c.p.i.ProducerBatch                : Resetting sequence number of
> > > batch with current sequence 1 for partition input-topic-58 to 0
> > > 2022-04-03 15:56:39.739  WARN --- [-thread | producer-1]
> > > o.a.k.c.p.i.Sender                      : [Producer
> clientId=producer-1]
> > > Got error produce response with correlation id 16512436 on
> > topic-partition
> > > input-topic-82, retrying (2147483646 attempts left). Error:
> > > OUT_OF_ORDER_SEQUENCE_NUMBER
> > >
> > > Regards,
> > > Neeraj
> > >
> > >
> >
>

Reply via email to