Todd Lipcon created KUDU-1354:
---------------------------------
Summary: MVCC Snapshots chosen during flush can contain
out-of-order transactions
Key: KUDU-1354
URL: https://issues.apache.org/jira/browse/KUDU-1354
Project: Kudu
Issue Type: Bug
Components: tablet
Affects Versions: 0.7.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
I spent a while trying to debug a failure of alter_table-randomized-test and
found the following interesting logs:
- We have two operations in the WAL which arrived in short succession (about
4ms apart) just before an alter table. I've renumbered the txids for
readability here:
{noformat}
1.13@2 REPLICATE WRITE_OP
op 0: MUTATE (int32 key=1643562) SET c6=1107303203
1.14@4 REPLICATE WRITE_OP
op 0: MUTATE (int32 key=1643562) DELETE
{noformat}
- The flush triggered by the alter table used the following snapshots:
{noformat}
... Phase 1 snapshot: MvccSnapshot[committed={T|T < 2 or (T in (4))]
...
... Phase 2 snapshot: MvccSnapshot[committed={T|T < 2 or (T in (4, 2))]
{noformat}
Note that the first snapshot considers the 'DELETE' (txid 4) committed but not
the 'UPDATE' (txid 2, the SET of c6). We then fill in the 'UPDATE' in the
second snapshot. The end result here is that we end up flushing REDO deltas as
follows:
REDO file 1 (flushed in phase 1): includes only the DELETE
REDO file 2 (flushed after ReupdateMissedDeltas): includes only the UPDATE
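
To illustrate how the two operations end up split this way, here is a minimal Python sketch of the two snapshots logged above. It is only a model for illustration: the `MvccSnapshot` class, its field names, and the `ops` table are hypothetical stand-ins and not Kudu's actual C++ implementation. It assumes that phase 1 flushes whatever its snapshot considers committed, and that the second file picks up what phase 2 sees but phase 1 missed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MvccSnapshot:
    """Hypothetical model of 'committed={T|T < N or (T in (...))}'."""
    all_committed_before: int            # every txid below this is committed
    committed: frozenset = frozenset()   # extra committed txids at or above it

    def is_committed(self, txid: int) -> bool:
        return txid < self.all_committed_before or txid in self.committed


# The two snapshots from the log, using the renumbered txids.
phase1 = MvccSnapshot(2, frozenset({4}))
phase2 = MvccSnapshot(2, frozenset({4, 2}))

# The two WAL operations on key=1643562, keyed by txid.
ops = {2: "SET c6=1107303203", 4: "DELETE"}

# Phase 1 flushes what its snapshot sees as committed.
redo_file_1 = [t for t in sorted(ops) if phase1.is_committed(t)]

# The second file gets what phase 2 sees but phase 1 missed.
redo_file_2 = [t for t in sorted(ops)
               if phase2.is_committed(t) and not phase1.is_committed(t)]

print("REDO file 1 txids:", redo_file_1)
print("REDO file 2 txids:", redo_file_2)
```

Under these assumptions the DELETE (txid 4) lands in the first file and the earlier UPDATE (txid 2) lands in the second, matching the layout listed above.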
When we later proceed to compact this rowset, we get "Check failed: !is_deleted
Got UPDATE for deleted row."
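
The failure can be sketched with a simplified replay loop. This is a hedged illustration only: the `replay` function and the tuple layout are hypothetical, and the sketch assumes the compaction applies REDO files in the order they were flushed, which is my reading of the behavior described here rather than a copy of Kudu's code.

```python
def replay(redo_files):
    """Apply REDO deltas file by file; return True if the row ends up deleted."""
    is_deleted = False
    for deltas in redo_files:
        for txid, change in deltas:
            if change == "DELETE":
                is_deleted = True
            elif is_deleted:
                # Mirrors the spirit of the "Check failed: !is_deleted" assertion.
                raise AssertionError(f"Got UPDATE for deleted row (txid {txid})")
    return is_deleted


redo_file_1 = [(4, "DELETE")]              # flushed in phase 1
redo_file_2 = [(2, "SET c6=1107303203")]   # flushed after ReupdateMissedDeltas

try:
    replay([redo_file_1, redo_file_2])
    failed, message = False, ""
except AssertionError as e:
    failed, message = True, str(e)

# The same two deltas replay cleanly when applied in txid order.
in_order_ok = replay([sorted(redo_file_1 + redo_file_2)])

print("flush-order replay failed:", failed, message)
print("txid-order replay ok:", in_order_ok)
```

In this model the deltas themselves are valid; only their placement across files puts the UPDATE after the DELETE.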
Scenarios like this seem to reproduce a few tenths of a percent of the time in
this stress test.