Vincent Jiang created KAFKA-14347:
-------------------------------------
Summary: deleted records may be kept unexpectedly when leader
changes while adding a new replica
Key: KAFKA-14347
URL: https://issues.apache.org/jira/browse/KAFKA-14347
Project: Kafka
Issue Type: Improvement
Reporter: Vincent Jiang
Consider that in a compacted topic, a regular record _k1=v1_ is deleted by a
later tombstone record _k1=null_. And imagine that somehow log compaction is
making different progress on the three replicas, _r1_, _r2_ and _r3_:
- on replica _r1_, log compaction has not cleaned _k1=v1_ or _k1=null_ yet.
- on replica _r2_, log compaction has cleaned and removed both _k1=v1_ and
_k1=null_.
In this case, the following sequence can cause record _k1=v1_ to be kept
unexpectedly:
1. Replica _r3_ is re-assigned to a different node and starts to replicate
data from the leader.
2. At the beginning, _r1_ is the leader, so _r3_ replicates record _k1=v1_ from
_r1_.
3. Before _k1=null_ is replicated from _r1_, the leader changes to _r2_.
4. _r3_ replicates data from _r2_. Because the _k1=null_ record has already
been cleaned on _r2_, it will not be replicated.
As a result, _r3_ has record _k1=v1_ but not _k1=null_.
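
The sequence above can be sketched as a toy simulation. This is not Kafka
code: `fetch`, `materialize` and `simulate` are hypothetical helpers, logs are
plain lists of (offset, key, value) tuples, and the extra record _k2=v2_ at
offset 2 is an assumption added only so that _r2_ still has something to serve
after compaction. The sketch also assumes that compaction preserves offsets
and that a follower resumes fetching at its own next offset.

```python
from typing import Dict, List, Optional, Tuple

# (offset, key, value); a value of None stands for a tombstone.
Record = Tuple[int, str, Optional[str]]


def fetch(leader_log: List[Record], fetch_offset: int, max_records: int) -> List[Record]:
    """Hypothetical fetch: return up to max_records records at or after fetch_offset."""
    return [r for r in leader_log if r[0] >= fetch_offset][:max_records]


def materialize(log: List[Record]) -> Dict[str, str]:
    """Replay a log into the key/value view a consumer would end up with."""
    state: Dict[str, str] = {}
    for _, key, value in log:
        if value is None:
            state.pop(key, None)  # tombstone deletes the key
        else:
            state[key] = value
    return state


# r1: compaction has not cleaned k1=v1 or k1=null yet.
r1: List[Record] = [(0, "k1", "v1"), (1, "k1", None), (2, "k2", "v2")]
# r2: compaction has removed both k1=v1 and k1=null (offsets are preserved).
r2: List[Record] = [(2, "k2", "v2")]


def simulate() -> List[Record]:
    r3: List[Record] = []
    next_offset = 0

    # Step 2: r1 is the leader; r3 replicates only k1=v1 before the change.
    batch = fetch(r1, next_offset, max_records=1)
    r3.extend(batch)
    next_offset = batch[-1][0] + 1

    # Steps 3-4: the leader changes to r2; r3 resumes from its next offset.
    # The tombstone at offset 1 no longer exists on r2, so it is never fetched.
    r3.extend(fetch(r2, next_offset, max_records=10))
    return r3


if __name__ == "__main__":
    r3 = simulate()
    print("r3 log:  ", r3)
    print("r1 view: ", materialize(r1))
    print("r2 view: ", materialize(r2))
    print("r3 view: ", materialize(r3))
```

In this sketch _r1_ and _r2_ both materialize to a view without _k1_, while
_r3_ still exposes _k1=v1_, which is the divergence described above.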
--
This message was sent by Atlassian Jira
(v8.20.10#820010)