Hello David Ribeiro Alves, Mike Percy,
I'd like you to do a code review. Please visit
http://gerrit.cloudera.org:8080/3073
to review the following change.
Change subject: KUDU-1131. Avoid CHECK failure during compaction when an
operation is slow to commit
......................................................................
KUDU-1131. Avoid CHECK failure during compaction when an operation is slow to
commit
This fixes a CHECK failure in the following case:
- an operation with txid 5 is replicated, but not yet applying
- an operation with txid 6 is replicated, and starts to apply
- the tablet flushes. We wait for txid 6 to commit, since it was already
applying when the flush snapshot was taken.
-- the resulting UNDO file now includes an UNDO at txid 6, so its
max_timestamp is 6
- the tablet issues a compaction, before txid 5 has committed
This would trigger a CHECK failure: the current snapshot shows txid 5 as
uncommitted, but there is an UNDO delta file with a max txid of 6. With the
code before this patch, that would have erroneously made the compaction code
think the UNDO file was actually a REDO file, since its time range overlapped
the current snapshot.
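To make the misclassification concrete, here is a minimal sketch of the
scenario above. It is a simplified model and not the actual Kudu code: the
Snapshot and DeltaFile classes, their fields, and guess_kind_from_time_range
are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical model of an MVCC snapshot: every txid strictly below
    # this bound is committed.
    all_committed_before: int
    # Txids at or above the bound that have nevertheless committed.
    committed_out_of_order: frozenset = frozenset()

    def is_committed(self, txid):
        return (txid < self.all_committed_before
                or txid in self.committed_out_of_order)

    def may_have_uncommitted_at_or_before(self, txid):
        # Anything at or above the bound could still be in flight.
        return txid >= self.all_committed_before


@dataclass(frozen=True)
class DeltaFile:
    kind: str  # the file's real type: "UNDO" or "REDO"
    min_timestamp: int
    max_timestamp: int


def guess_kind_from_time_range(delta, snap):
    # Assumed stand-in for the old heuristic: a file whose time range
    # overlaps the uncommitted part of the snapshot is taken to be a REDO.
    if snap.may_have_uncommitted_at_or_before(delta.max_timestamp):
        return "REDO"
    return "UNDO"


# txid 5 is replicated but uncommitted; txid 6 has already committed.
snap = Snapshot(all_committed_before=5,
                committed_out_of_order=frozenset({6}))
# The flush wrote an UNDO file containing an UNDO at txid 6.
flushed_undo = DeltaFile(kind="UNDO", min_timestamp=6, max_timestamp=6)

guessed = guess_kind_from_time_range(flushed_undo, snap)
print(flushed_undo.kind, guessed)
```

In this model the file really is an UNDO file, but the time-range guess labels
it a REDO file because txid 5 is still uncommitted in the snapshot.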
To fix this, I added a new call to DeltaTracker to specifically fetch UNDO or
REDO delta files, rather than relying on the time range and snapshot to do so.
The CHECK can now safely be removed.
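The idea behind the fix can be sketched as follows. This is only an
illustration under assumed names (DeltaTrackerSketch, add_store, and
stores_of_type are hypothetical and do not reflect the real DeltaTracker
interface): the caller asks for a delta type explicitly, so no timestamp
comparison against the snapshot is needed to tell UNDO from REDO.

```python
from enum import Enum


class DeltaType(Enum):
    UNDO = "UNDO"
    REDO = "REDO"


class DeltaTrackerSketch:
    """Hypothetical tracker that keeps UNDO and REDO stores in separate lists."""

    def __init__(self):
        self._undo_stores = []
        self._redo_stores = []

    def _stores(self, delta_type):
        if delta_type is DeltaType.UNDO:
            return self._undo_stores
        return self._redo_stores

    def add_store(self, delta_type, name, max_timestamp):
        self._stores(delta_type).append((name, max_timestamp))

    def stores_of_type(self, delta_type):
        # The type comes from the caller, not from a time-range comparison.
        # Return a copy so callers cannot mutate the tracker's own list.
        return list(self._stores(delta_type))


tracker = DeltaTrackerSketch()
# An UNDO file with max_timestamp 6, as in the scenario described above.
tracker.add_store(DeltaType.UNDO, "undo_0", 6)
tracker.add_store(DeltaType.REDO, "redo_0", 7)

undos = tracker.stores_of_type(DeltaType.UNDO)
redos = tracker.stores_of_type(DeltaType.REDO)
print(undos, redos)
```

Because the UNDO store is returned regardless of which transactions the
current snapshot considers committed, a slow-to-commit txid 5 cannot change how
the file is classified.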
The original commit that added this CHECK was
1a6b80a310a7de3519d78a2f5e90ecaae1cf405a.
The commit message there mentions that linked_list-test was modified at that
point to act as a regression test for the original bug. I ran that test 500
times and all runs passed:
http://dist-test.cloudera.org//job?job_id=todd.1463187608.28912
I also ran mt-tablet-test "DoTestAllAtOnce" 4000 times. This test was flaky (1
or 2 failures out of 4000) prior to this change and now passes:
http://dist-test.cloudera.org/job?job_id=todd.1463186765.27185
Change-Id: Ie16f2c6d190a322c107d60312d4c35d7aa409c43
---
M src/kudu/tablet/compaction.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
3 files changed, 40 insertions(+), 54 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/3073/1
--
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie16f2c6d190a322c107d60312d4c35d7aa409c43
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>