[
https://issues.apache.org/jira/browse/KAFKA-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992967#comment-14992967
]
Jiangjie Qin commented on KAFKA-2759:
-------------------------------------
[~ewencp] Would it be simpler if we just set the auto reset to smallest?
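As a rough illustration of that suggestion, and assuming mirror maker is running the old (0.8.x) high-level consumer, the consumer properties file passed via --consumer.config would carry something like the fragment below. Only the auto.offset.reset line is the point here. A reset then starts from the earliest available offset, which trades the gap for re-mirroring whatever the source still retains.

```properties
# Consumer config for mirror maker (old 0.8.x consumer assumed).
# Valid values for the old consumer are "smallest" and "largest" (the default).
auto.offset.reset=smallest
```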
> Mirror maker can leave gaps of missing messages if the process dies after a
> partition is reset and before the first offset commit
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-2759
> URL: https://issues.apache.org/jira/browse/KAFKA-2759
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.8.2.2
> Reporter: Ewen Cheslack-Postava
> Priority: Minor
>
> Based on investigation of KAFKA-2747. When mirror maker first starts or if it
> picks up new topics/partitions, it will use the reset policy to choose where
> to start. By default this uses 'latest'. If it starts reading messages and
> then dies before committing offsets for the first time, then the mirror maker
> that takes over that partition will also reset. This can result in some
> messages making it to the consumer, then a gap of messages that were skipped,
> and then messages that get processed by the new MM process.
> One solution to this problem would be to make sure that offsets are committed
> after they are reset but before the first message is passed to the producer.
> In other words, in the case of a reset, MM should record where it's going to
> start reading data from before processing any messages. This guarantees all
> messages after the first one delivered by MM will appear at least once.
> This is minor since it should be very rare, but it does break an assumption
> that people probably make about the output -- that once you start receiving
> data, all subsequent messages are guaranteed to appear at least once.
> This same issue could affect Copycat as well. In fact, it may be generally
> useful to allow consumers to know when the offset was reset so they can
> handle cases like this.
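
The ordering proposed in the quoted description (record the reset position before any message is handed to the producer) can be sketched with a small toy model. This is not mirror maker code and uses no Kafka APIs: the class ResetGapSketch, its methods, and the offsets 100, 105, 120 and 125 are all hypothetical, chosen only to make the failure sequence concrete.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the KAFKA-2759 scenario; none of these names are Kafka APIs. */
public class ResetGapSketch {
    private long committed = -1; // -1 means no offset has been committed for the partition
    private final List<Long> mirrored = new ArrayList<>();
    private final boolean commitOnReset;

    ResetGapSketch(boolean commitOnReset) {
        this.commitOnReset = commitOnReset;
    }

    /** Start position for a new process: the committed offset if any, else reset to the log end. */
    private long startPosition(long logEnd) {
        if (committed >= 0) {
            return committed;
        }
        if (commitOnReset) {
            committed = logEnd; // proposed fix: record the reset position before processing anything
        }
        return logEnd;
    }

    /** Mirror offsets [start, end) to the target; commit only if the process survives. */
    private void mirror(long start, long end, boolean diesBeforeCommit) {
        for (long offset = start; offset < end; offset++) {
            mirrored.add(offset);
        }
        if (!diesBeforeCommit) {
            committed = end;
        }
    }

    /** Runs the two-process failure sequence and returns every offset that reached the target. */
    public static List<Long> simulate(boolean commitOnReset) {
        ResetGapSketch s = new ResetGapSketch(commitOnReset);
        long start1 = s.startPosition(100); // first process starts when the log end is 100
        s.mirror(start1, 105, true);        // copies 100..104, then dies before its first commit
        long start2 = s.startPosition(120); // replacement starts when the log end is 120
        s.mirror(start2, 125, false);       // copies up to 124 and commits
        return s.mirrored;
    }

    /** First offset missing between the smallest and largest mirrored offsets, or -1 if none. */
    public static long firstGap(List<Long> offsets) {
        long min = Collections.min(offsets);
        long max = Collections.max(offsets);
        Set<Long> seen = new HashSet<>(offsets);
        for (long offset = min; offset <= max; offset++) {
            if (!seen.contains(offset)) {
                return offset;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("reset without commit, first gap: " + firstGap(simulate(false))); // 105
        System.out.println("commit on reset, first gap: " + firstGap(simulate(true)));       // -1
    }
}
```

In this model the fix removes the gap at the cost of re-delivering offsets 100..104, which matches the at-least-once wording in the description.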
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)