KeDeng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18503 )
Change subject: KUDU-3367 [compaction] add supplement to gc algorithm ...................................................................... Patch Set 31: (22 comments) Thanks for your reviews. http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG@9 PS31, Line 9: delta with full of delete op > ... delta full of delete ops ... Done http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG@10 PS31, Line 10: no update op else in the file > ... not a single update operation in the delta ... Done http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG@10 PS31, Line 10: The current compact algorithm will : not schedule the file do compact > The current compaction algorithm doesn't run GC on such deltamemstores. Done http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG@11 PS31, Line 11: If such files exist, after : accumulating for a period of time, it will greatly affect our scan : speed. > The accumulation of deltamemstores like that negatively affects performance Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.h File src/kudu/tablet/delta_tracker.h: http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.h@399 PS31, Line 399: need release storage > need to release storage Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.h@400 PS31, Line 400: live count > nit: live row count Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc File src/kudu/tablet/delta_tracker.cc: http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@64 PS31, Line 64: DEFINE_uint64(all_delete_op_delta_file_cnt_for_compact, 1, : "The compaction scheduler will not deal the compact delta files " : "with only delete ops until the file count threshold reached. " > What sort of files do we count here? Delta memstores? Would be great to b It's redo delta store files here, I've updated the comments. In addition, the numeric threshold is used here because we may need to control the number of redo files in the future, and the boolean type cannot achieve this effect http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@67 PS31, Line 67: FLAGS_tablet_history_max_age_sec > nit: --tablet_history_max_age_sec Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@954 PS31, Line 954: // Get all delete op only delta file count. : // Ignore only one delete op for the delta file. > Count the number of REDO delta stores that contain only DELETE operations w Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@966 PS31, Line 966: kOutdatedInterval > nit: why not to call this just 'max_delta_age'? Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@967 PS31, Line 967: > style nit: wrong indent (see https://google.github.io/styleguide/cppguide.h Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@974 PS31, Line 974: > nit: assignment Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test-base.h File src/kudu/tablet/diskrowset-test-base.h: http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test-base.h@155 PS31, Line 155: Delete some data at a given rate from the given rowset. > Delete rows in the specified range from the given rowset. Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc File src/kudu/tablet/diskrowset-test.cc: http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@647 PS31, Line 647: AffectCompaction > Would be great to have more meaningful name here, or maybe add a comment ex Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@648 PS31, Line 648: With this setting, we want major compactions to basically always have a score. > Make major delta compaction runnable even with tiny amount of data accumula Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@650 PS31, Line 650: FLAGS_rowset_metadata_store_keys = true; > Is this essential to have set flag set to 'true'? If yes, please add a com Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@657 PS31, Line 657: Get > Generate Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@671 PS31, Line 671: Get > Generate Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@671 PS31, Line 671: deleted data > DELETE operations Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@676 PS31, Line 676: + > style nit: add spaces around '+' sign Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@681 PS31, Line 681: FLAGS_tablet_history_max_age_sec = 1; > Please add a comment what's the intention behind setting this parameter so Done http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@684 PS31, Line 684: Compaction of the delete op only delta files will be scheduled > Major delta compaction of the DRS should obtain a non-zero score. Done -- To view, visit http://gerrit.cloudera.org:8080/18503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b26737dffecc17688b42188da959b2ba16351ed Gerrit-Change-Number: 18503 Gerrit-PatchSet: 31 Gerrit-Owner: KeDeng <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: KeDeng <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Yingchun Lai <[email protected]> Gerrit-Comment-Date: Mon, 12 Dec 2022 02:40:05 +0000 Gerrit-HasComments: Yes
