Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11142

to look at the new patch set (#2).

Change subject: tablet_bootstrap: adjust mvcc safetime with no-ops
......................................................................

tablet_bootstrap: adjust mvcc safetime with no-ops

Previously, during tablet bootstrap, a tablet would only update its MVCC
safetime based on write messages, as the timestamps in the write
messages are guaranteed to be serialized with respect to one another,
by virtue of being assigned in a single thread (the prepare thread) on
the leader replica.

From this, we conclude that timestamps for write operations are
monotonically increasing in unison with opid. The same cannot
necessarily be said for timestamps of no-ops and change configs.

This is a conservative conclusion about assigned timestamps, and this
patch hinges on the fact that our Raft implementation ensures the
following sequence of events:

1. replica A becomes leader of Term N
2. leader A assigns a timestamp t1 to its no-op
3. leader A replicates the no-op to replicas B and C, asserting its
   leadership for Term N
4. leader A prepares a write and assigns it a timestamp t2. A assigns a
   higher timestamp than t1, as this step happens after Step 2
5. leader A replicates the write to replicas B and C, checking that it
   is leader for the current term

Given the above series of operations, within the same term, the no-op
used to assert leadership is always assigned a timestamp that must be
lower than that of any write in that term. As such, the timestamps
assigned to no-ops can and should be used to bump safetime.

This patch updates tablet bootstrap to adjust MVCC safetime based on
no-ops seen in the WALs.

A test is added asserting that this holds for no-ops with respect to
writes, i.e. all replicate messages must have monotonically increasing
OpIds and monotonically increasing timestamps.

A case in tablet_bootstrap-test depends on the ability for no-ops to be
written out of timestamp order.
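
As a rough illustration of the replay rule described above, here is a
minimal sketch in Python. It is not Kudu's C++ code: ReplicateMsg,
trust_timestamp, safetime_after_replay and first_ordering_violation are
hypothetical names, and modelling safetime as a running maximum is an
assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ReplicateMsg:
    """Hypothetical stand-in for one replicate message read from a WAL."""
    term: int
    index: int
    op_type: str                  # "WRITE", "NO_OP" or "CHANGE_CONFIG"
    timestamp: int
    trust_timestamp: bool = True  # assumed per-no-op flag; only read for NO_OP


def safetime_after_replay(msgs: Sequence[ReplicateMsg]) -> int:
    """Advance a toy safetime from writes and from trusted no-ops only."""
    safetime = 0
    for msg in msgs:
        is_write = msg.op_type == "WRITE"
        is_trusted_noop = msg.op_type == "NO_OP" and msg.trust_timestamp
        if is_write or is_trusted_noop:
            # Assumption: safetime never moves backwards.
            safetime = max(safetime, msg.timestamp)
    return safetime


def first_ordering_violation(msgs: Sequence[ReplicateMsg]) -> Optional[int]:
    """Return the position of the first message whose OpId or timestamp
    does not strictly increase over its predecessor, or None if all do."""
    for pos in range(1, len(msgs)):
        prev, cur = msgs[pos - 1], msgs[pos]
        opid_increases = (cur.term, cur.index) > (prev.term, prev.index)
        timestamp_increases = cur.timestamp > prev.timestamp
        if not (opid_increases and timestamp_increases):
            return pos
    return None


if __name__ == "__main__":
    wal = [
        ReplicateMsg(1, 1, "NO_OP", 10),   # leader asserts Term 1
        ReplicateMsg(1, 2, "WRITE", 20),
        ReplicateMsg(2, 3, "NO_OP", 30),   # new leader asserts Term 2
    ]
    print(safetime_after_replay(wal))      # 30
    print(first_ordering_violation(wal))   # None
```

In this toy log, replaying writes alone would leave safetime at 20; the
Term 2 no-op moves it to 30. A no-op marked trust_timestamp=False is
skipped by safetime_after_replay, which is how an out-of-order no-op
could coexist with the rule.
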
To maintain this, and to keep this functionality around (which may be
useful for general timestamp assignment testing), a flag has been added
to the NoOpRequestPB indicating whether or not its timestamp should be
trusted to advance safetime.

Additionally, I tweaked the artificial timestamps used in
raft_consensus-itest. These timestamps were previously very low and
would overlap with the real timestamp used by the leadership no-op.

Change-Id: I26deff32da8c990cb8a2ba220bb81858ddd6d73f
---
M src/kudu/consensus/consensus.proto
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/log_verifier.cc
M src/kudu/integration-tests/log_verifier.h
M src/kudu/integration-tests/raft_consensus-itest.cc
A src/kudu/integration-tests/timestamp_serialization-itest.cc
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_bootstrap.cc
8 files changed, 241 insertions(+), 14 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/11142/2
--
To view, visit http://gerrit.cloudera.org:8080/11142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I26deff32da8c990cb8a2ba220bb81858ddd6d73f
Gerrit-Change-Number: 11142
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins