[jira] [Commented] (CAMEL-14935) KafkaConsumer commits old offset values in a failure scenario causing message replays and offset reset error

Chris McCarthy (Jira) Thu, 30 Apr 2020 22:34:23 -0700


    [ 
https://issues.apache.org/jira/browse/CAMEL-14935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17097182#comment-17097182
 ]


Chris McCarthy commented on CAMEL-14935:
----------------------------------------

Thanks Claus,

Yes, deleting the entry from the map in a finally block would address this 
specific vulnerability, as would clearing out the map on assignment.

However, looking at the newer versions, there seems to be a concept of a 
partition epoch leader introduced in the commit flow that maintains a map of 
the last seen values per partition. I am trying to determine whether this is 
intended to be a general fix for this type of issue; it appears to be in 
this area.

I think we may be better off upgrading to a newer version first.
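
For reference, a minimal sketch of the two options mentioned above (removing 
the entry in a finally block, and clearing on assignment). This is not the 
camel-kafka code: the OffsetTracker class, the Committer interface and the 
String partition keys are hypothetical stand-ins, and only the 
lastProcessedOffset name comes from the issue description.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the actual camel-kafka implementation.
class OffsetTracker {

    /** Hypothetical stand-in for the real commit call, which may throw. */
    public interface Committer {
        void commit(String partition, long offset);
    }

    // Latest successfully processed offset per partition.
    private final Map<String, Long> lastProcessedOffset = new HashMap<>();

    public void recordProcessed(String partition, long offset) {
        lastProcessedOffset.put(partition, offset);
    }

    /** Option 1: always drop the entry on revoke, even if the commit fails. */
    public void onPartitionsRevoked(Collection<String> partitions, Committer committer) {
        for (String partition : partitions) {
            Long offset = lastProcessedOffset.get(partition);
            try {
                if (offset != null) {
                    // Kafka convention: commit the offset of the next record to read.
                    committer.commit(partition, offset + 1);
                }
            } catch (RuntimeException e) {
                // Kafka's CommitFailedException is a RuntimeException; a real
                // consumer would log it here.
            } finally {
                // The stale value can no longer be re-committed on a later rebalance.
                lastProcessedOffset.remove(partition);
            }
        }
    }

    /** Option 2: discard anything left over when a partition is (re)assigned. */
    public void onPartitionsAssigned(Collection<String> partitions) {
        for (String partition : partitions) {
            lastProcessedOffset.remove(partition);
        }
    }

    public Long lastOffset(String partition) {
        return lastProcessedOffset.get(partition);
    }
}
```

With either option, a rebalance that happens before any message has been 
processed finds no entry for the partition and so commits nothing, rather 
than re-committing the old offset.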

 

> KafkaConsumer commits old offset values in a failure scenario causing message 
> replays and offset reset error
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: CAMEL-14935
>                 URL: https://issues.apache.org/jira/browse/CAMEL-14935
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-kafka
>    Affects Versions: 2.24.0
>            Reporter: Chris McCarthy
>            Priority: Major
>             Fix For: 3.x
>
>
> We are occasionally experiencing unexpected offset reset errors, as well as 
> occasional replays of messages (without an offset reset error).
> The cause seems to be a failed commit on rebalance, which leaves an old 
> value in the HashMap used to store the latest processed offset for a 
> partition. This old value is then re-read and re-committed across 
> rebalances in certain situations.
> Our relevant configuration details are:
>  autoCommitEnable=false
>  allowManualCommit=true
>  autoOffsetReset=earliest
> It seems that when the KafkaConsumer experiences an exception committing 
> the offset (CommitFailedException) upon a rebalance, the old offset value 
> is left in the lastProcessedOffset HashMap.
> If a subsequent rebalance assigns the same partition to the same consumer, 
> and that consumer then experiences another rebalance before any messages 
> have been processed successfully, it will commit this old offset again. 
> (Successfully processing a message would overwrite the invalid value and 
> self-correct the problem.) This offset may be very old if there have been 
> many rebalances since the original rebalance that failed to commit its 
> offset.
> If the old offset is beyond the retention period and the message is no longer 
> available, the outcome is an offset reset error. If the offset is within the 
> retention period, all messages are replayed from that offset without an error 
> being thrown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
