[
https://issues.apache.org/jira/browse/KAFKA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Gustafson resolved KAFKA-6975.
------------------------------------
Resolution: Fixed
Fix Version/s: (was: 1.1.1)
(was: 1.0.2)
> AdminClient.deleteRecords() may cause replicas unable to fetch from beginning
> -----------------------------------------------------------------------------
>
> Key: KAFKA-6975
> URL: https://issues.apache.org/jira/browse/KAFKA-6975
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 1.1.0, 1.0.1
> Reporter: Anna Povzner
> Assignee: Anna Povzner
> Priority: Blocker
> Fix For: 2.0.0
>
>
> AdminClient.deleteRecords(beforeOffset(offset)) sets the log start offset to
> the requested offset. If the requested offset falls in the middle of a batch,
> a replica will not be able to fetch from that offset, because the batch the
> leader returns for it has a base offset below the requested offset.
> One use case where this could cause problems is replica re-assignment.
> Suppose we have a topic partition with 3 initial replicas, and at some point
> the user issues AdminClient.deleteRecords() for an offset that falls in the
> middle of a batch. That offset now becomes the log start offset for this
> topic partition. Suppose that at some later time the user starts partition
> re-assignment to 3 new replicas. The new replicas (followers) will start with
> HW = 0 and try to fetch from 0, then get "out of order offset" because 0 <
> log start offset (LSO). Each follower will be able to reset its offset to the
> leader's LSO and fetch from LSO. The leader will send a batch in response
> with base offset < LSO, which will cause "out of order offset" on the
> follower and stop the fetcher thread. The end result is that the new replicas
> will not be able to start fetching unless LSO moves to an offset that is not
> in the middle of a batch, and the re-assignment will be stuck for a possibly
> very long time.
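
The failure mode described above can be illustrated with a small, self-contained
model. This is only a sketch under simplifying assumptions, not Kafka's actual
broker code: the class LogStartOffsetSketch, the Batch type, and the methods
leaderFetch, followerCanAppend, and fetchFromLogStartSucceeds are hypothetical
names invented for illustration. It assumes the leader always returns the whole
batch containing the fetch offset, and that the follower rejects any batch
whose base offset is below the offset it expects next.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the scenario; none of these names exist in Kafka.
final class LogStartOffsetSketch {

    /** A record batch covering offsets [baseOffset, lastOffset], inclusive. */
    static final class Batch {
        final long baseOffset;
        final long lastOffset;

        Batch(long baseOffset, long lastOffset) {
            this.baseOffset = baseOffset;
            this.lastOffset = lastOffset;
        }
    }

    /** Three batches of ten records each: [0,9], [10,19], [20,29]. */
    static List<Batch> sampleLog() {
        List<Batch> log = new ArrayList<>();
        log.add(new Batch(0, 9));
        log.add(new Batch(10, 19));
        log.add(new Batch(20, 29));
        return log;
    }

    /** Leader side (assumed): return the whole batch containing fetchOffset, or null. */
    static Batch leaderFetch(List<Batch> log, long fetchOffset) {
        for (Batch b : log) {
            if (b.baseOffset <= fetchOffset && fetchOffset <= b.lastOffset) {
                return b;
            }
        }
        return null;
    }

    /** Follower side (assumed): a batch starting below the expected offset is rejected. */
    static boolean followerCanAppend(long expectedOffset, Batch batch) {
        return batch != null && batch.baseOffset >= expectedOffset;
    }

    /** A new follower resets to the leader's log start offset and fetches from it. */
    static boolean fetchFromLogStartSucceeds(List<Batch> log, long logStartOffset) {
        Batch response = leaderFetch(log, logStartOffset);
        return followerCanAppend(logStartOffset, response);
    }

    public static void main(String[] args) {
        List<Batch> log = sampleLog();
        // Log start offset on a batch boundary: the follower can append the batch.
        System.out.println("LSO=10 fetch ok: " + fetchFromLogStartSucceeds(log, 10));
        // Log start offset in the middle of batch [10,19]: the batch is rejected.
        System.out.println("LSO=15 fetch ok: " + fetchFromLogStartSucceeds(log, 15));
    }
}
```

In this model a log start offset of 10 sits on a batch boundary and the fetch
succeeds, while a log start offset of 15 yields the batch with base offset 10,
which the follower rejects. That mirrors why the re-assignment stays stuck
until the log start offset moves to an offset that is not in the middle of a
batch.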
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)