[ https://issues.apache.org/jira/browse/KUDU-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Berkeley resolved KUDU-1020.
---------------------------------
Resolution: Fixed
Fix Version/s: 1.4.0
Commit e6141a0adf6e3c5a54be8cfdf5acd0f1ff65f714
> ksck with snapshot reports divergence even if a server is just behind
> ---------------------------------------------------------------------
>
> Key: KUDU-1020
> URL: https://issues.apache.org/jira/browse/KUDU-1020
> Project: Kudu
> Issue Type: Improvement
> Components: ksck
> Affects Versions: Private Beta
> Reporter: Todd Lipcon
> Assignee: Will Berkeley
> Fix For: 1.4.0
>
>
> Something seems to be wrong with how ksck handles checksum timestamps. I
> have a recently restarted cluster, and I ran ksck. One of the tablets has a
> replica that was "lost", i.e. it fell too far behind and therefore could
> never be caught up. ksck just reports it as a bad checksum. Shouldn't it
> instead wait until the provided timestamp is "safe" and, if the wait
> times out, give an error that the replica is too far behind?
>
> As a stopgap, maybe we could have ksck also include the latest opid in the
> error printout, to make it more obvious that a server is just "behind" and
> not divergent?
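
For illustration only, the behavior requested in the description above could be sketched roughly as follows. This is not Kudu code and not what the fix commit does; ReplicaState, check_replica, and all field names are hypothetical, and the replica is modeled as a plain callable that returns its current state. The sketch waits for the replica's safe time to reach the snapshot timestamp, reports "behind" (with the latest opid) if the wait times out, and only compares checksums once the timestamp is safe.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReplicaState:
    """Hypothetical view of one replica; not a Kudu type."""
    safe_time: int   # latest timestamp at which a snapshot read is assumed safe
    last_opid: str   # latest operation id, written here as "term.index"
    checksum: int    # checksum of the replica's data at the snapshot


def check_replica(get_state: Callable[[], ReplicaState],
                  snapshot_ts: int,
                  expected_checksum: int,
                  timeout_s: float = 1.0,
                  poll_s: float = 0.01) -> str:
    """Classify a replica as OK, BEHIND, or DIVERGENT."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state.safe_time >= snapshot_ts:
            break  # the snapshot timestamp is safe; checksums are comparable
        if time.monotonic() >= deadline:
            # The wait timed out: the replica is lagging, not necessarily corrupt.
            return (f"BEHIND: safe time {state.safe_time} < snapshot "
                    f"{snapshot_ts} (last opid {state.last_opid})")
        time.sleep(poll_s)
    if state.checksum != expected_checksum:
        return f"DIVERGENT: checksum mismatch (last opid {state.last_opid})"
    return "OK"
```

Including the latest opid in both messages corresponds to the stopgap suggested in the description: even without the wait, a reader can tell a lagging replica from a divergent one.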