[ 
https://issues.apache.org/jira/browse/KAFKA-12181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263610#comment-17263610
 ] 

Jason Gustafson commented on KAFKA-12181:
-----------------------------------------

[~jagsancio] I'm not sure it's necessary to preserve the high watermark after 
transitioning to leader. Currently the leader will not expose the new high 
watermark until it has reached the LeaderChange message that is written at the 
start of the leader epoch. However, there are a few improvements I have been 
thinking about here, given that the leader propagates its high 
watermark to followers:

1. If a voter knows it has not reached the last reported high watermark from 
the leader, then it can refuse to become a candidate.
2. If a voter receives a vote request from a candidate whose offset has not 
reached the last reported high watermark from the leader, it can reject the 
vote.

These conditions are strictly stronger than what Raft requires. This would 
require that voters track the "global high watermark," which is a little 
different from what they do today. I will open a separate issue to give this 
some more thought. I have also been thinking about how to handle disaster 
scenarios where we might have to relax some of the standard validations.
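
As a rough illustration of the two voter-side checks proposed above, here is a minimal Java sketch. It is not taken from the Kafka code base: the class HighWatermarkElectionGuard and all of its method names are hypothetical, and it assumes the voter simply remembers the largest high watermark the leader has reported to it. These checks would sit on top of Raft's usual log comparison rather than replace it.

```java
import java.util.OptionalLong;

// Hypothetical helper, not part of Kafka: tracks the last high watermark
// reported by the leader and applies the two extra election checks.
final class HighWatermarkElectionGuard {
    // Empty until the leader has reported a high watermark at least once.
    private OptionalLong lastReportedHighWatermark = OptionalLong.empty();

    // Record a high watermark reported by the leader, keeping the largest value seen.
    void onLeaderHighWatermark(long highWatermark) {
        if (!lastReportedHighWatermark.isPresent()
                || highWatermark > lastReportedHighWatermark.getAsLong()) {
            lastReportedHighWatermark = OptionalLong.of(highWatermark);
        }
    }

    // Check 1: a voter that has not reached the reported high watermark
    // refuses to become a candidate.
    boolean mayBecomeCandidate(long localLogEndOffset) {
        return reachesHighWatermark(localLogEndOffset);
    }

    // Check 2: a voter rejects a vote request from a candidate whose offset
    // has not reached the reported high watermark.
    boolean mayGrantVote(long candidateLogEndOffset) {
        return reachesHighWatermark(candidateLogEndOffset);
    }

    // With no reported high watermark there is nothing to compare against,
    // so this sketch lets the check pass.
    private boolean reachesHighWatermark(long offset) {
        return !lastReportedHighWatermark.isPresent()
                || offset >= lastReportedHighWatermark.getAsLong();
    }
}
```

In this sketch, a voter that has seen a reported high watermark of 10 would decline to stand for election with a log end offset of 9, and would reject a vote request from a candidate at offset 9, while offsets of 10 or more pass both checks.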


> Loosen monotonic fetch offset validation by raft leader
> -------------------------------------------------------
>
>                 Key: KAFKA-12181
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12181
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>
> Currently, in the Raft leader implementation, we validate that follower 
> fetch offsets increase monotonically. This protects the guarantees that Raft 
> provides, since a non-monotonic update means that the follower has lost 
> data it had already replicated, which may or may not result in the loss of 
> committed data. It depends on whether the update also causes a non-monotonic 
> update to the high watermark. If the 
> fetch is from an observer, no harm done since observers do not affect the 
> high watermark. If the fetch is from a voter and a majority of nodes 
> (excluding the fetcher) have offsets larger than or equal to the high 
> watermark, also no harm done. It's easy to check for these cases and log a 
> warning instead of raising an error.
> The question then is what to do if we get a voter fetch which does cause the 
> high watermark to regress? The problem is that there are some scenarios where 
> data loss might be unavoidable. For example, a follower's disk might become 
> corrupt and ultimately get replaced. Often the damage is already done by the 
> time we get the Fetch request with the non-monotonic offset, so the stricter 
> validation in fact just prevents recovery. 
> It's worth noting also that the stricter validation by the leader cannot be 
> relied on to detect data loss. It could be the case that a recovered voter 
> restarts in the middle of an election. There is no general way that I'm aware 
> of that lets us detect when a voter has lost previously committed data.
> With all of this in mind, my conclusion is that it makes sense to loosen the 
> validation in fetches. The leader can still ensure that its high watermark 
> does not go backwards and we can still log a warning, but it should not 
> prevent replicas from catching up after hard failures with disk state loss. 
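
The loosened validation described in the issue could be sketched roughly as follows. This is an illustrative Java sketch under stated assumptions, not the actual Kafka implementation: the class LeaderFetchValidation, its Outcome values, and its method names are all hypothetical. It assumes the leader's own log end offset is included in the voter map, treats any replica id missing from that map as an observer, and ignores the LeaderChange message constraint. A non-monotonic fetch is always accepted; the result only indicates which warning to log, and the leader's high watermark is never moved backwards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Kafka code: accepts non-monotonic fetches and
// reports whether they would have regressed the high watermark.
final class LeaderFetchValidation {
    enum Outcome { ACCEPTED, WARN_HARMLESS, WARN_HIGH_WATERMARK_AT_RISK }

    // Latest known offset per voter, including the leader's own end offset.
    private final Map<Integer, Long> voterOffsets = new HashMap<>();
    private long highWatermark;

    LeaderFetchValidation(Map<Integer, Long> initialVoterOffsets) {
        voterOffsets.putAll(initialVoterOffsets);
        highWatermark = majorityOffset();
    }

    long highWatermark() {
        return highWatermark;
    }

    Outcome onFetch(int replicaId, long fetchOffset) {
        Long previous = voterOffsets.get(replicaId);
        if (previous == null) {
            // Observer: it never contributes to the high watermark, and this
            // sketch does not track its offset at all.
            return Outcome.ACCEPTED;
        }
        voterOffsets.put(replicaId, fetchOffset);
        long candidate = majorityOffset();
        // The high watermark only ever advances, even if the majority offset drops.
        if (candidate > highWatermark) {
            highWatermark = candidate;
        }
        if (fetchOffset >= previous) {
            return Outcome.ACCEPTED;
        }
        // Non-monotonic voter fetch: harmless if a majority still covers the
        // current high watermark, otherwise committed data may have been lost.
        return candidate >= highWatermark
                ? Outcome.WARN_HARMLESS
                : Outcome.WARN_HIGH_WATERMARK_AT_RISK;
    }

    // Largest offset that a majority of voters have reached.
    private long majorityOffset() {
        List<Long> sorted = new ArrayList<>(voterOffsets.values());
        Collections.sort(sorted);
        int majority = sorted.size() / 2 + 1;
        return sorted.get(sorted.size() - majority);
    }
}
```

With three voters all at offset 10, a fetch from one follower at offset 4 yields WARN_HARMLESS because the other two voters still cover the high watermark. A further fetch from the second follower at offset 3 yields WARN_HIGH_WATERMARK_AT_RISK, yet the fetch is still accepted and the leader keeps reporting a high watermark of 10.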



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
