Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18503 )

Change subject: KUDU-3367 [compaction] add supplement to gc algorithm
......................................................................


Patch Set 31:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG@9
PS31, Line 9: delta with full of delete op
... delta full of delete ops ...


http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG@10
PS31, Line 10: no update op else in the file
... not a single update operation in the delta ...


http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG@10
PS31, Line 10: The current compact algorithm will
             : not schedule the file do compact
The current compaction algorithm doesn't run GC on such deltamemstores.


http://gerrit.cloudera.org:8080/#/c/18503/31//COMMIT_MSG@11
PS31, Line 11: If such files exist, after
             : accumulating for a period of time, it will greatly affect our 
scan
             : speed.
The accumulation of deltamemstores like that negatively affects performance of 
scan operations.


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.h@399
PS31, Line 399: need release storage
need to release storage


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@64
PS31, Line 64: DEFINE_uint64(all_delete_op_delta_file_cnt_for_compact, 1,
             :               "The compaction scheduler will not deal the 
compact delta files "
             :               "with only delete ops until the file count 
threshold reached. "
What sort of files do we count here?  Delta memstores?  Would be great to be a 
bit more precise.

Also, why it's important to have this as some sort sort of numeric threshold 
instead of just a boolean?


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@954
PS31, Line 954:     // Get all delete op only delta file count.
              :     // Ignore only one delete op for the delta file.
Count the number of REDO delta stores that contain only DELETE operations where 
more than a single DELETE operation is present.


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@966
PS31, Line 966: kOutdatedInterval
nit: why not to call this just 'max_delta_age'?


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/delta_tracker.cc@967
PS31, Line 967:
style nit: wrong indent (see 
https://google.github.io/styleguide/cppguide.html#Function_Calls for acceptable 
options)


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test-base.h
File src/kudu/tablet/diskrowset-test-base.h:

http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test-base.h@155
PS31, Line 155: Delete some data at a given rate from the given rowset.
Delete rows in the specified range from the given rowset.


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc
File src/kudu/tablet/diskrowset-test.cc:

http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@647
PS31, Line 647: AffectCompaction
Would be great to have more meaningful name here, or maybe add a comment 
explaining what this test scenario tries to verify.


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@648
PS31, Line 648: With this setting, we want major compactions to basically 
always have a score.
Make major delta compaction runnable even with tiny amount of data accumulated 
across rowset's deltas.


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@650
PS31, Line 650: FLAGS_rowset_metadata_store_keys = true;
Is this essential to have set flag set to 'true'?  If yes, please add a comment 
explaining the reasoning; if not, remove this custom setting.


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@657
PS31, Line 657: Get
Generate


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@671
PS31, Line 671: deleted data
DELETE operations


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@671
PS31, Line 671: Get
Generate


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@676
PS31, Line 676: +
style nit: add spaces around '+' sign


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@681
PS31, Line 681: FLAGS_tablet_history_max_age_sec = 1;
Please add a comment what's the intention behind setting this parameter so low.


http://gerrit.cloudera.org:8080/#/c/18503/31/src/kudu/tablet/diskrowset-test.cc@684
PS31, Line 684: Compaction of the delete op only delta files will be scheduled
Major delta compaction of the DRS should obtain a non-zero score.



--
To view, visit http://gerrit.cloudera.org:8080/18503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b26737dffecc17688b42188da959b2ba16351ed
Gerrit-Change-Number: 18503
Gerrit-PatchSet: 31
Gerrit-Owner: KeDeng <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: KeDeng <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Comment-Date: Sun, 11 Dec 2022 00:38:55 +0000
Gerrit-HasComments: Yes

Reply via email to