Ashwani Raina has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20546 )

Change subject: [compaction] Skip memory allocation for ancient undo deltas
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20546/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20546/1//COMMIT_MSG@7
PS1, Line 7: [compaction] Skip memory allocation for ancient undo deltas
> From what I can see, the TabletHistoryGcTest.TestNoGenerateUndoOnMajorDelta
There is no update in functionality here. The only difference is that we take 
note of "ancient" REDO mutations quite early in the processing rather than in 
later part of compaction.

Without this change, compaction does take into consideration all the ancient 
REDO mutations and allocates memory for their corresponding UNDO deltas only to 
ignore those in the later stage of processing.

With this change, compaction doesn't take into consideration all the ancient 
REDO mutations from get go and hence no memory is allocated for converting 
those mutations to UNDO deltas.

So, the behaviour remains the same i.e. after flushing updates on a row, the 
mutations that have become ancient, should have no corresponding UNDO mutations 
present.

I am not quite clear on what outcome you are expecting from the path mentioned 
above.

Considering above scenario, here are my thoughts.
When a lot of REDO deltas are generated, those can be converted into UNDO 
deltas via either MajorDeltaCompaction or RowSet Compaction while those REDO 
deltas are not ancient. However, if there is a change in AHM, to say 1 sec 
after a restart, and a compaction maintenance op kicks in afterwards, it is 
going to process those REDO deltas and treat them as ancient. As those are 
ancient, compaction will not have any UNDO delta for such REDO mutations. 
Eventually, what we will be left with is the list of only those mutations that 
are not ancient. Even if there are some REDO deltas that were minor-compacted, 
it won't matter because those will have ancient timestamp on resultant REDO 
delta as well.

The newly added test does verify by ensuring that after rowset compaction is 
complete, any history scan (beyond AHM) should yield the latest row data value 
i.e. 2 because all the REDO mutations have been applied and there is no 
corresponding UNDO delta for those (that represent first INSERT and subsequent 
UPDATE to value 2). Similarly, if the same set of history scan is done before 
compaction was invoked, it gives the old mutations i.e. row value as 0, 1.

These test essentially ensure that there are no change in functionalities and 
expectations as far as handling of ancient REDO mutation goes. One test already 
ensures that by verifying for MajorDeltaCompaction. The new one that I added 
ensures the same for rowset compaction.

As part of manual unit testing, I had added some stats/counter to measure the 
number of times memory allocation was skipped. This was more of a manual 
verification. Maybe I can add that metric and verify at the end of the test.



--
To view, visit http://gerrit.cloudera.org:8080/20546
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb41636a3ac063478fe181560d4ec85dadeb0ef3
Gerrit-Change-Number: 20546
Gerrit-PatchSet: 1
Gerrit-Owner: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Zoltan Martonka <[email protected]>
Gerrit-Comment-Date: Sun, 22 Oct 2023 20:27:50 +0000
Gerrit-HasComments: Yes

Reply via email to