Andrew Wong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17561 )

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
......................................................................

KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

In performing a diff scan, Kudu iterates in small batches of rows,
selecting deltas associated with each row that are relevant to the
scan's timestamp bounds. Once all selected deltas are collected for a
given row, the oldest and newest deltas are found using sorting criteria
meant to order deltas by their application order:

1. Deltas of lower timestamps are less than deltas of higher timestamps.
2. UNDO deltas are less than REDO deltas.
3. Within each delta store's iterator, a counter of selected deltas is
   used to disambiguate between deltas that have the same timestamp. A
   critical assumption here is that this disambiguator is only used for
   deltas of the same delta store. For REDOs, a lower counter implies
   a lower application order -- the opposite is true for UNDOs.
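
As a rough illustration of the three criteria above, the comparison could
be sketched as follows. The names SelectedDelta, AppliedBefore, and
OldestAndNewest are hypothetical stand-ins chosen for this sketch and are
not the actual Kudu types or functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; not the actual Kudu types.
enum class DeltaType { kUndo, kRedo };

struct SelectedDelta {
  uint64_t timestamp;     // timestamp of the mutation
  DeltaType type;         // UNDO or REDO
  int64_t disambiguator;  // per-iterator counter of selected deltas
};

// Returns true if 'a' has a lower application order than 'b'.
bool AppliedBefore(const SelectedDelta& a, const SelectedDelta& b) {
  // 1. Lower timestamps sort first.
  if (a.timestamp != b.timestamp) return a.timestamp < b.timestamp;
  // 2. UNDOs sort before REDOs.
  if (a.type != b.type) return a.type == DeltaType::kUndo;
  // 3. Same timestamp and type: use the counter. For REDOs a lower
  //    counter means a lower application order; for UNDOs the reverse.
  if (a.type == DeltaType::kRedo) return a.disambiguator < b.disambiguator;
  return a.disambiguator > b.disambiguator;
}

// Finds the oldest and newest deltas selected for a single row.
std::pair<SelectedDelta, SelectedDelta> OldestAndNewest(
    const std::vector<SelectedDelta>& deltas) {
  assert(!deltas.empty());
  auto mm = std::minmax_element(deltas.begin(), deltas.end(), AppliedBefore);
  return std::make_pair(*mm.first, *mm.second);
}
```

Note that step 3 is only meaningful when both counters come from the same
delta store, which is the assumption the criteria rely on.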

What the above criteria don't account for is the fact that certain
iterators can iterate over separate delta stores that have deltas of the
same timestamp and type. If a delta flush occurs while Kudu is applying
a large batch of updates to the same row, some of the batch's updates
can land in the newly flushed REDO delta file, while the rest land in
the new DeltaMemStore (DMS). Iterating over these stores with a
DeltaIteratorMerger, which combines the deltas of several delta stores,
breaks the assumption described above, resulting in the crash reported
in KUDU-3291.
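
To make the failure mode concrete, here is a small model of two stores
that each number their selected REDOs independently. Redo,
SelectFromStore, Before, and NewestValue are hypothetical names invented
for this sketch; it only shows the ordering becoming inconsistent and
does not reproduce the actual crash.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical simplified REDO; 'value' stands in for the delta payload.
struct Redo {
  uint64_t timestamp;
  int64_t disambiguator;
  int value;
};

// Models one delta store's iterator: it numbers the REDOs it selects
// with its own counter, starting from zero.
std::vector<Redo> SelectFromStore(uint64_t timestamp,
                                  const std::vector<int>& values) {
  std::vector<Redo> out;
  int64_t counter = 0;
  for (int v : values) out.push_back({timestamp, counter++, v});
  return out;
}

// Orders REDOs by (timestamp, disambiguator), which is only valid when
// both counters were generated by the same store.
bool Before(const Redo& a, const Redo& b) {
  if (a.timestamp != b.timestamp) return a.timestamp < b.timestamp;
  return a.disambiguator < b.disambiguator;
}

// Returns the payload of the delta the comparator considers newest.
int NewestValue(const std::vector<Redo>& deltas) {
  assert(!deltas.empty());
  return std::max_element(deltas.begin(), deltas.end(), Before)->value;
}
```

If a batch writing values 1 through 5 at one timestamp is split so that
1, 2, 3 land in the flushed file and 4, 5 land in the DMS, the two stores
produce counters 0, 1, 2 and 0, 1. In this model the comparator then
picks value 3 as the newest even though 5 was applied last, and the two
deltas numbered 0 cannot be told apart.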

To remediate this, in iterators that merge multiple delta stores,
namely, the DeltaIteratorMerger, a single top-level counter is used to
guide the disambiguators generated by each sub-iterator. Before
iterating over a new batch of deltas in a given sub-iterator, this
counter is propagated to that sub-iterator as the new starting point of
its counter. The result is that the disambiguators generated by the
DeltaIteratorMerger can be used to define a total ordering of the deltas
selected.
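
A minimal sketch of the shared-counter idea, assuming REDO-only stores
visited from oldest to newest. StoreIterator, set_counter, counter,
SelectDeltas, and MergeSelect are hypothetical names for illustration
and do not reflect the real DeltaIteratorMerger interface.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical simplified REDO; 'value' stands in for the delta payload.
struct Redo {
  uint64_t timestamp;
  int64_t disambiguator;
  int value;
};

// Hypothetical sub-iterator over a single delta store.
class StoreIterator {
 public:
  StoreIterator(uint64_t timestamp, std::vector<int> values)
      : timestamp_(timestamp), values_(std::move(values)) {}

  // Sets the starting point for this iterator's counter.
  void set_counter(int64_t c) {
    assert(c >= counter_);  // the counter should never move backwards
    counter_ = c;
  }
  int64_t counter() const { return counter_; }

  // Selects every delta in the store, numbering each with the counter.
  void SelectDeltas(std::vector<Redo>* out) {
    for (int v : values_) out->push_back({timestamp_, counter_++, v});
  }

 private:
  uint64_t timestamp_;
  std::vector<int> values_;
  int64_t counter_ = 0;
};

// Hypothetical merger: a single top-level counter is handed to each
// sub-iterator before it selects, then read back afterwards.
std::vector<Redo> MergeSelect(std::vector<StoreIterator>* stores) {
  std::vector<Redo> out;
  int64_t next = 0;
  for (StoreIterator& s : *stores) {
    s.set_counter(next);
    s.SelectDeltas(&out);
    next = s.counter();
  }
  return out;
}
```

With the counter carried across stores, a batch split between a flushed
file (values 1, 2, 3) and the DMS (values 4, 5) receives the distinct
disambiguators 0 through 4, so comparing by counter yields a total order
in this model.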

Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Reviewed-on: http://gerrit.cloudera.org:8080/17547
Reviewed-by: Alexey Serbin <aser...@cloudera.com>
Tested-by: Andrew Wong <aw...@cloudera.com>
Reviewed-by: Grant Henke <granthe...@apache.org>
(cherry picked from commit 9ecee7ba065f1f7c844f1afd7136ab0565ce2340)
Reviewed-on: http://gerrit.cloudera.org:8080/17561
Reviewed-by: Bankim Bhavsar <ban...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/diff_scan-test.cc
M src/kudu/tablet/tablet-test-base.h
8 files changed, 157 insertions(+), 32 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Bankim Bhavsar: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.15.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17561
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ban...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
