Ashwani Raina has posted comments on this change. ( http://gerrit.cloudera.org:8080/20546 )
Change subject: [compaction] Skip memory allocation for ancient undo deltas ...................................................................... Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20546/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20546/1//COMMIT_MSG@7 PS1, Line 7: [compaction] Skip memory allocation for ancient undo deltas > From what I can see, the TabletHistoryGcTest.TestNoGenerateUndoOnMajorDelta There is no update in functionality here. The only difference is that we take note of "ancient" REDO mutations quite early in the processing rather than in later part of compaction. Without this change, compaction does take into consideration all the ancient REDO mutations and allocates memory for their corresponding UNDO deltas only to ignore those in the later stage of processing. With this change, compaction doesn't take into consideration all the ancient REDO mutations from get go and hence no memory is allocated for converting those mutations to UNDO deltas. So, the behaviour remains the same i.e. after flushing updates on a row, the mutations that have become ancient, should have no corresponding UNDO mutations present. I am not quite clear on what outcome you are expecting from the path mentioned above. Considering above scenario, here are my thoughts. When a lot of REDO deltas are generated, those can be converted into UNDO deltas via either MajorDeltaCompaction or RowSet Compaction while those REDO deltas are not ancient. However, if there is a change in AHM, to say 1 sec after a restart, and a compaction maintenance op kicks in afterwards, it is going to process those REDO deltas and treat them as ancient. As those are ancient, compaction will not have any UNDO delta for such REDO mutations. Eventually, what we will be left with is the list of only those mutations that are not ancient. Even if there are some REDO deltas that were minor-compacted, it won't matter because those will have ancient timestamp on resultant REDO delta as well. The newly added test does verify by ensuring that after rowset compaction is complete, any history scan (beyond AHM) should yield the latest row data value i.e. 2 because all the REDO mutations have been applied and there is no corresponding UNDO delta for those (that represent first INSERT and subsequent UPDATE to value 2). Similarly, if the same set of history scan is done before compaction was invoked, it gives the old mutations i.e. row value as 0, 1. These test essentially ensure that there are no change in functionalities and expectations as far as handling of ancient REDO mutation goes. One test already ensures that by verifying for MajorDeltaCompaction. The new one that I added ensures the same for rowset compaction. As part of manual unit testing, I had added some stats/counter to measure the number of times memory allocation was skipped. This was more of a manual verification. Maybe I can add that metric and verify at the end of the test. -- To view, visit http://gerrit.cloudera.org:8080/20546 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb41636a3ac063478fe181560d4ec85dadeb0ef3 Gerrit-Change-Number: 20546 Gerrit-PatchSet: 1 Gerrit-Owner: Ashwani Raina <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Ashwani Raina <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Martonka <[email protected]> Gerrit-Comment-Date: Sun, 22 Oct 2023 20:27:50 +0000 Gerrit-HasComments: Yes
