[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-08 Thread Andrew Wong (Code Review)
Andrew Wong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17547 )

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..

KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

In performing a diff scan, Kudu iterates in small batches of rows,
selecting deltas associated with each row that are relevant to the
scan's timestamp bounds. Once all selected deltas are collected for a
given row, the oldest and newest deltas are found by a sorting criteria
meant to sort deltas by their application order:

1. Deltas of lower timestamps are less than deltas of higher timestmaps.
2. UNDO deltas are less than REDO deltas.
3. Within each delta store's iterator, a counter of selected deltas is
   used to disambiguate between rows that have the same timestamp. A
   critical assumption here is that this disambiguator is only used for
   deltas of the same delta store. For REDOs, a lower counter implies
   a lower application order -- the opposite is true for UNDOs.

What the above criteria don't account for is the fact that certain
iterators can iterate over separate delta stores that have deltas of the
same timestamp and type. If Kudu delta flushes while applying a large
batch of updates to the same row, the result is that some of the batch's
updates can land in the newly flushed REDO delta file, while the rest
land in the new DMS. In iterating over these stores with a
DeltaIteratorMerger, which combines the deltas of several delta stores,
this breaks the assumption described above, resulting in the crash
reported in KUDU-3291.

To remediate this, in iterators that merge multiple delta stores,
namely, the DeltaIteratorMerger, a single top-level counter is used to
guide the disambiguators generated by each sub-iterator. Before
iterating over a new batch of deltas in a given sub-iterator, this
counter is propagated to the sub-iterators as the new starting point of
its counter. The result is that the disambiguators generated by the
DeltaIteratorMerger can be used to define a total ordering of the deltas
selected.

Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Reviewed-on: http://gerrit.cloudera.org:8080/17547
Reviewed-by: Alexey Serbin 
Tested-by: Andrew Wong 
Reviewed-by: Grant Henke 
---
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/diff_scan-test.cc
M src/kudu/tablet/tablet-test-base.h
8 files changed, 157 insertions(+), 32 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Andrew Wong: Verified
  Grant Henke: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-08 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17547 )

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Jun 2021 14:45:14 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17547 )

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..


Patch Set 4: Verified+1

Test failure looks unrelated: 
ClientTest.TxnKeepAliveAndUnavailableTxnManagerLongTime


--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Jun 2021 03:38:00 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-07 Thread Andrew Wong (Code Review)
Andrew Wong has removed a vote on this change.

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17547 )

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Jun 2021 01:08:27 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-07 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Grant Henke, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17547

to look at the new patch set (#4).

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..

KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

In performing a diff scan, Kudu iterates in small batches of rows,
selecting deltas associated with each row that are relevant to the
scan's timestamp bounds. Once all selected deltas are collected for a
given row, the oldest and newest deltas are found by a sorting criteria
meant to sort deltas by their application order:

1. Deltas of lower timestamps are less than deltas of higher timestmaps.
2. UNDO deltas are less than REDO deltas.
3. Within each delta store's iterator, a counter of selected deltas is
   used to disambiguate between rows that have the same timestamp. A
   critical assumption here is that this disambiguator is only used for
   deltas of the same delta store. For REDOs, a lower counter implies
   a lower application order -- the opposite is true for UNDOs.

What the above criteria don't account for is the fact that certain
iterators can iterate over separate delta stores that have deltas of the
same timestamp and type. If Kudu delta flushes while applying a large
batch of updates to the same row, the result is that some of the batch's
updates can land in the newly flushed REDO delta file, while the rest
land in the new DMS. In iterating over these stores with a
DeltaIteratorMerger, which combines the deltas of several delta stores,
this breaks the assumption described above, resulting in the crash
reported in KUDU-3291.

To remediate this, in iterators that merge multiple delta stores,
namely, the DeltaIteratorMerger, a single top-level counter is used to
guide the disambiguators generated by each sub-iterator. Before
iterating over a new batch of deltas in a given sub-iterator, this
counter is propagated to the sub-iterators as the new starting point of
its counter. The result is that the disambiguators generated by the
DeltaIteratorMerger can be used to define a total ordering of the deltas
selected.

Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
---
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/diff_scan-test.cc
M src/kudu/tablet/tablet-test-base.h
8 files changed, 157 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/17547/4
--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17547 )

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/17547/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17547/3//COMMIT_MSG@36
PS3, Line 36: this counter is
: propagated to the sub-iterators
> Just to make sure I understand this correctly: is there any guarantee that
Yeah, right now, each sub-iterator only keeps track of the deltas within the 
delta store being iterated over. In that sense, the disambiguators assigned by 
each sub-iterator are correct only for that delta store.


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/delta_store.h
File src/kudu/tablet/delta_store.h:

http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/delta_store.h@144
PS3, Line 144: a.disambiguator != b.disambiguator
> nit: wrap into PREDICT_TRUE()?
Done


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/delta_store.h@147
PS3, Line 147: "Could not differentiate between two deltas";
> Not a part of this changelist, but does it make sense to drump information
Done


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/delta_store.h@473
PS3, Line 473: DCHECK_GE
> Can we put DCHECK_GT() here?  Or it can be that no deltas are selected in a
It is possible that no deltas are selected in the sub-iterators.


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc
File src/kudu/tablet/diff_scan-test.cc:

http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc@228
PS3, Line 228: &
> nit: maybe, store it by value?
Done


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc@233
PS3, Line 233: ASSERT_OK
> nit: could it ever fail?  If yes, then it'd necessary to join the thread si
Done


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc@243
PS3, Line 243:   opts.order = ORDERED;;
 :   opts.include_deleted_rows = true;
> How crucial are these custom settings for the reproduction of the issue?  I
Done


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc@246
PS3, Line 246: static const bool
> nit: as for 'kRowKey', could use constexpr here as well (just for uniformit
This is removed.



--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Jun 2021 00:36:20 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17547 )

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..


Patch Set 3: Code-Review+1

(8 comments)

LGTM overall, just a couple of questions (just to make sure my understanding is 
correct) and nits.

http://gerrit.cloudera.org:8080/#/c/17547/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17547/3//COMMIT_MSG@36
PS3, Line 36: this counter is
: propagated to the sub-iterators
Just to make sure I understand this correctly: is there any guarantee that 
those counters would be different for sub-iterators?  Is that because every 
selected delta increases the count?


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/delta_store.h
File src/kudu/tablet/delta_store.h:

http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/delta_store.h@144
PS3, Line 144: a.disambiguator != b.disambiguator
nit: wrap into PREDICT_TRUE()?


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/delta_store.h@147
PS3, Line 147: "Could not differentiate between two deltas";
Not a part of this changelist, but does it make sense to drump information on 
deltas into the fatal error message?


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/delta_store.h@473
PS3, Line 473: DCHECK_GE
Can we put DCHECK_GT() here?  Or it can be that no deltas are selected in a 
sub-iterator?


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc
File src/kudu/tablet/diff_scan-test.cc:

http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc@228
PS3, Line 228: &
nit: maybe, store it by value?


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc@233
PS3, Line 233: ASSERT_OK
nit: could it ever fail?  If yes, then it'd necessary to join the thread since 
line 234 wouldn't be reachable


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc@243
PS3, Line 243:   opts.order = ORDERED;;
 :   opts.include_deleted_rows = true;
How crucial are these custom settings for the reproduction of the issue?  It 
would be nice to add a comment about that.


http://gerrit.cloudera.org:8080/#/c/17547/3/src/kudu/tablet/diff_scan-test.cc@246
PS3, Line 246: static const bool
nit: as for 'kRowKey', could use constexpr here as well (just for uniformity)



--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 07 Jun 2021 20:17:42 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-06 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Grant Henke, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17547

to look at the new patch set (#3).

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..

KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

In performing a diff scan, Kudu iterates in small batches of rows,
selecting deltas associated with each row that are relevant to the
scan's timestamp bounds. Once all selected deltas are collected for a
given row, the oldest and newest deltas are found by a sorting criteria
meant to sort deltas by their application order:

1. Deltas of lower timestamps are less than deltas of higher timestmaps.
2. UNDO deltas are less than REDO deltas.
3. Within each delta store's iterator, a counter of selected deltas is
   used to disambiguate between rows that have the same timestamp. A
   critical assumption here is that this disambiguator is only used for
   deltas of the same delta store. For REDOs, a lower counter implies
   a lower application order -- the opposite is true for UNDOs.

What the above criteria don't account for is the fact that certain
iterators can iterate over separate delta stores that have deltas of the
same timestamp and type. If Kudu delta flushes while applying a large
batch of updates to the same row, the result is that some of the batch's
update can land in the newly flushed REDO delta file, while the rest
land in the new DMS. In iterating over these stores with a
DeltaIteratorMerger, which combines the deltas of several delta stores,
this breaks the assumption described above, resulting in the crash
reported in KUDU-3291.

To remediate this, in iterators that merge multiple delta stores,
namely, the DeltaIteratorMerger, a single top-level counter is used to
define the disambiguators generated by sub-iterator. Before iterating
over a new batch of deltas in a given sub-iterator, this counter is
propagated to the sub-iterators as the new starting point of its
counter. The result is that the disambiguators generated by the
DeltaIteratorMerger can be used to define a total ordering of the deltas
selected.

Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
---
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/diff_scan-test.cc
M src/kudu/tablet/tablet-test-base.h
8 files changed, 152 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/17547/3
--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-06 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Grant Henke, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17547

to look at the new patch set (#2).

Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..

KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

In performing a diff scan, Kudu iterates in small batches of rows,
selecting deltas associated with each row that are relevant to the
scan's timestamp bounds. Once all selected deltas are collected for a
given row, the oldest and newest deltas are found by a sorting criteria
meant to sort deltas by their application order:
1. Deltas of lower timestamps are less than deltas of higher timestmaps.
2. UNDO deltas are less than REDO deltas.
3. Within each delta store's iterator, a counter of selected deltas is
   used to disambiguate between rows that have the same timestamp. A
   critical assumption here is that this disambiguator is only used for
   deltas of the same delta store. For REDOs, a lower counter implies
   a lesser application order -- the opposite is true for UNDOs.

What the above criteria don't account for is the fact that certain
iterators can iterate over separate delta stores have deltas of the same
timestamp and type. If Kudu delta flushes while applying a large batch
of updates to the same row, the result is that some of the batch's
update can land in the newly flushed REDO delta file, while the rest
land in the new DMS. In iterating over these stores with a
DeltaIteratorMerger, which combines the deltas of several delta stores,
this breaks the assumption described above, resulting in the crash
reported in KUDU-3291.

To remediate this, in iterators that merge multiple delta stores,
namely, the DeltaIteratorMerger, a single top-level counter is used to
define the disambiguators generated by sub-iterator. Before iterating
over a new batch of deltas in a given sub-iterator, this counter is
propagated to the sub-iterators as the new starting point of its
counter. The result is that the disambiguators generated by the
DeltaIteratorMerger can be used to define a total ordering of the deltas
selected.

Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
---
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/diff_scan-test.cc
M src/kudu/tablet/tablet-test-base.h
8 files changed, 151 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/17547/2
--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

2021-06-06 Thread Andrew Wong (Code Review)
Andrew Wong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17547


Change subject: KUDU-3291: properly disambiguate between deltas of a row with 
the same timestamp
..

KUDU-3291: properly disambiguate between deltas of a row with the same timestamp

In performing a diff scan, Kudu iterates in small batches of rows,
selecting deltas associated with each row that are relevant to the
scan's timestamp bounds. Once all selected deltas are collected for a
given row, the oldest and newest deltas are found by a sorting criteria
meant to sort deltas by their application order:
1. Deltas of lower timestamps are less than deltas of higher timestmaps.
2. UNDO deltas are less than REDO deltas.
3. Within each delta store's iterator, a counter of selected deltas is
   used to disambiguate between rows that have the same timestamp. A
   critical assumption here is that this disambiguator is only used for
   deltas of the same delta store. For REDOs, a lower counter implies
   a lesser application order -- the opposite is true for UNDOs.

What the above criteria don't account for is the fact that certain
iterators can iterate over separate delta stores have deltas of the same
timestamp and type. If Kudu delta flushes while applying a large batch
of updates to the same row, the result is that some of the batch's
update can land in the newly flushed REDO delta file, while the rest
land in the new DMS. In iterating over these stores with a
DeltaIteratorMerger, which combines the deltas of several delta stores,
this breaks the assumption described above, resulting in the crash
reported in KUDU-3291.

To remediate this, in iterators that merge multiple delta stores,
namely, the DeltaIteratorMerger, a single top-level counter is used to
define the disambiguators generated by sub-iterator. Before iterating
over a new batch of deltas in a given sub-iterator, this counter is
propagated to the sub-iterators as the new starting point of its
counter. The result is that the disambiguators generated by the
DeltaIteratorMerger can be used to define a total ordering of the deltas
selected.

Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
---
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/diff_scan-test.cc
M src/kudu/tablet/tablet-test-base.h
8 files changed, 144 insertions(+), 31 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/17547/1
--
To view, visit http://gerrit.cloudera.org:8080/17547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iccfc518999d36679f85ed901ba65cf7b4894cd55
Gerrit-Change-Number: 17547
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong