Todd Lipcon has posted comments on this change.

Change subject: WIP KUDU-1127 Don't hang scanner threads waiting for safe time

Patch Set 1:


Seems like a reasonable heuristic.

The only kinda funny thing is that in the normal case, where the safetime moves 
stepwise every Raft heartbeat (e.g. once every 500ms or once a second), the 
heuristic has the opposite effect from the one desired. In other words, just 
after the safetime has been updated, we won't reject anything (even though 
that's precisely the time when the next update is farthest off).

Put another way, there are basically two "modes" to worry about. In the 
lagging/abandoned mode, the current time gets farther and farther ahead of the 
safetime, and thus it's reasonable to assume "the longer we have been 
abandoned, the more likely we are to be abandoned for a longer time". In the 
non-failure mode, it's the opposite: "the longer we've been waiting for a 
heartbeat, the more likely it is that our next heartbeat is about to arrive".
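
To make the mismatch concrete, here is a rough sketch. It assumes the heuristic boils down to "reject when the safetime lag exceeds the scan's remaining time budget", which is my reading and not necessarily what the patch does. RejectByLag and TimeToNextUpdateMs are made-up names for illustration, and times are plain milliseconds.

```cpp
#include <cstdint>

// Hypothetical lag-based check (an assumption about the patch's heuristic):
// reject the wait when safetime trails "now" by more than the remaining budget.
bool RejectByLag(int64_t now_ms, int64_t safe_time_ms, int64_t budget_ms) {
  return now_ms - safe_time_ms > budget_ms;
}

// In the healthy, stepwise mode: how long until the next heartbeat-driven
// safetime update, given when the last one arrived and the heartbeat interval.
int64_t TimeToNextUpdateMs(int64_t now_ms, int64_t last_update_ms,
                           int64_t heartbeat_ms) {
  const int64_t elapsed = now_ms - last_update_ms;
  return elapsed >= heartbeat_ms ? 0 : heartbeat_ms - elapsed;
}
```

With a 500ms heartbeat and a 200ms budget: 10ms after an update the lag check accepts the wait even though the next update is about 490ms away, and 450ms after an update it rejects even though the next update is only about 50ms away.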

I don't know if it's worth trying to adjust for this based on knowledge of the 
Raft heartbeat interval or empirical knowledge of the timing between the 
(n-2)th safetime update and the (n-1)th update, but maybe worth a note in the 
code about this weird effect?
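
If it did turn out to be worth adjusting, one possible shape is below. This is only a sketch: SafeTimeUpdateTracker is a hypothetical class, not something in the codebase, and the "2x the last interval means abandoned" cutoff is an arbitrary assumption chosen for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: remembers the gap between the last two safetime
// updates and uses it to guess when the next one will arrive.
class SafeTimeUpdateTracker {
 public:
  // Call whenever safetime advances.
  void RecordUpdate(int64_t now_ms) {
    if (have_last_) {
      last_interval_ms_ = now_ms - last_update_ms_;
      have_interval_ = true;
    }
    last_update_ms_ = now_ms;
    have_last_ = true;
  }

  // Returns true if a scan with 'budget_ms' left should not bother waiting.
  bool ShouldReject(int64_t now_ms, int64_t budget_ms) const {
    if (!have_interval_) return false;  // No history yet: just wait.
    const int64_t elapsed = now_ms - last_update_ms_;
    if (elapsed > 2 * last_interval_ms_) {
      // Overdue by an arbitrary 2x factor: treat as lagging/abandoned and
      // fall back to "the longer we've waited, the longer we'll wait".
      return elapsed > budget_ms;
    }
    // Healthy mode: the next update is expected one interval after the last.
    const int64_t predicted_wait =
        std::max<int64_t>(0, last_interval_ms_ - elapsed);
    return predicted_wait > budget_ms;
  }

 private:
  bool have_last_ = false;
  bool have_interval_ = false;
  int64_t last_update_ms_ = 0;
  int64_t last_interval_ms_ = 0;
};
```

The point of the split is that the healthy branch rejects right after an update and accepts just before the next one, while the overdue branch keeps the lag-based behavior for an abandoned replica.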

Commit Message:

Line 15: This allowed to swap linked_list-test to finish with snapshot scans
Why not merge the test change in, so this goes in with its end-to-end test?

File src/kudu/consensus/

PS1, Line 133: LOG(WARNING) 
probably worth throttling this, otherwise a server that got abandoned might 
spew these warnings
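
For illustration, time-based throttling can be as simple as the sketch below. LogThrottler is a hand-rolled, hypothetical helper written only to show the idea (glog's count-based LOG_EVERY_N is another option); it is not thread-safe as written, so a real version would need a lock or atomics.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: allows at most one log message per 'min_gap'.
// Not thread-safe; illustration only.
class LogThrottler {
 public:
  using Clock = std::chrono::steady_clock;

  explicit LogThrottler(std::chrono::milliseconds min_gap)
      : min_gap_(min_gap) {}

  // Returns true if the caller should emit the message now.
  bool ShouldLog(Clock::time_point now) {
    if (!logged_before_ || now - last_logged_ >= min_gap_) {
      last_logged_ = now;
      logged_before_ = true;
      return true;
    }
    ++suppressed_;  // Count what was dropped so it can be reported later.
    return false;
  }

  int64_t suppressed() const { return suppressed_; }

 private:
  const std::chrono::milliseconds min_gap_;
  Clock::time_point last_logged_;
  bool logged_before_ = false;
  int64_t suppressed_ = 0;
};
```

The call site would then guard the warning with something like `if (throttler.ShouldLog(LogThrottler::Clock::now())) LOG(WARNING) << ...;`.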

PS1, Line 135: deadline
Maybe call this the remaining time budget, or the remaining timeout?

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7cd0b0749e715c5d9e665a8e37d0f1c95af574e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves <>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <>
Gerrit-HasComments: Yes
