[
https://issues.apache.org/jira/browse/CAMEL-14935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120207#comment-17120207
]
Chris McCarthy edited comment on CAMEL-14935 at 5/30/20, 11:02 AM:
-------------------------------------------------------------------
[~dariusx] Yes, that's what I was thinking. I would be inclined to throw the
exception, as the default handler prints the useful message below. A helpful
addition would be to also log the offset key and the offset, since that makes
it easier to search the logs when tracing issues with a particular partition
or offset. But nice one, as this is the fix needed!
"org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be
completed since the group has already rebalanced and assigned the partitions to
another member. This means that the time between subsequent calls to poll() was
longer than the configured max.poll.interval.ms, which typically implies that
the poll loop is spending too much time message processing. You can address
this either by increasing max.poll.interval.ms or by reducing the maximum size
of batches returned in poll() with max.poll.records."
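A minimal sketch of the logging suggestion above: log the offset key and the offset before rethrowing the commit failure, so the log line can be tied back to a particular partition. The wrapper method and names here are hypothetical, not Camel's actual API; a real consumer would catch `org.apache.kafka.clients.consumer.CommitFailedException`, which is stood in for by a `RuntimeException` to keep the example self-contained.

```java
import java.util.logging.Logger;

public class CommitFailureLogging {
    private static final Logger LOG = Logger.getLogger(CommitFailureLogging.class.getName());

    // Hypothetical wrapper around the commit call: logs the offset key and
    // offset for diagnostics, then rethrows so the default error handler
    // still sees the exception (as suggested in the comment above).
    static void commitWithLogging(Runnable commit, String offsetKey, long offset) {
        try {
            commit.run();
        } catch (RuntimeException e) {
            LOG.warning("Offset commit failed for key " + offsetKey
                    + " at offset " + offset + ": " + e.getMessage());
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            // Simulate a CommitFailedException raised on rebalance.
            commitWithLogging(() -> {
                throw new IllegalStateException("group has already rebalanced");
            }, "topicA/partition-0", 1234L);
        } catch (IllegalStateException expected) {
            System.out.println("rethrown: " + expected.getMessage());
        }
    }
}
```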
> KafkaConsumer commits old offset values in a failure scenario causing message
> replays and offset reset error
> ------------------------------------------------------------------------------------------------------------
>
> Key: CAMEL-14935
> URL: https://issues.apache.org/jira/browse/CAMEL-14935
> Project: Camel
> Issue Type: Bug
> Components: camel-kafka
> Affects Versions: 2.24.0
> Reporter: Chris McCarthy
> Priority: Major
> Fix For: 3.x
>
>
> We are experiencing unexpected offset reset errors occasionally, as well as
> occasional replay of messages (without an offset reset error).
> The cause seems to be a failed commit on rebalance, leaving an old value in
> the hashMap used to store the latest processed offset for a partition. This
> old value is then re-read and re-committed across rebalances in certain
> situations.
> Our relevant configuration details are:
> autoCommitEnable=false
> allowManualCommit=true
> autoOffsetReset=earliest
> It seems that when the KafkaConsumer hits a CommitFailedException while
> committing the offset on a rebalance, the old offset value is left in the
> lastProcessedOffset hashMap.
> If a subsequent rebalance assigns the same partition to the same consumer,
> and another rebalance then occurs before any messages have been processed
> successfully (successful processing would overwrite the stale value and
> self-correct the problem), the consumer commits this old offset again. The
> offset may be very old if there have been many rebalances between the
> original failed commit and this point.
> If the old offset is beyond the retention period and the message is no longer
> available the outcome is an offset reset error. If the offset is within the
> retention period all messages are replayed from that offset without an error
> being thrown.
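The failure mode described above can be sketched with a plain map standing in for the lastProcessedOffset hashMap. This is an illustrative simulation, not Camel's actual consumer code: the commit is simulated with a `RuntimeException` in place of `CommitFailedException`, and the proposed fix is shown as removing the stale entry on failure so a later rebalance cannot re-commit an old offset.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleOffsetSketch {
    // Simulates Camel's per-partition map of last processed offsets
    // (lastProcessedOffset in the issue description).
    final Map<Integer, Long> lastProcessedOffset = new HashMap<>();

    // Simulated commit that always fails, standing in for a
    // CommitFailedException thrown during a rebalance.
    void failingCommit(int partition, long offset) {
        throw new IllegalStateException("Commit cannot be completed: group rebalanced");
    }

    // Commit the last processed offset for a partition; on failure, drop the
    // stale entry so it is not re-read and re-committed after later
    // rebalances (the self-correction the issue says only happens once a new
    // message is processed successfully).
    void commitOnRevoke(int partition) {
        Long offset = lastProcessedOffset.get(partition);
        if (offset == null) {
            return;
        }
        try {
            failingCommit(partition, offset);
        } catch (IllegalStateException e) {
            lastProcessedOffset.remove(partition); // the sketched fix
        }
    }
}
```

With the stale entry removed, the next rebalance finds no offset for the partition and simply skips the commit instead of replaying from a long-expired position.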
--
This message was sent by Atlassian Jira
(v8.3.4#803005)