David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1601. Delete ancient UNDO delta blocks in the background
......................................................................


Patch Set 9:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/tserver/tserver_mm_ops.cc
File src/kudu/tserver/tserver_mm_ops.cc:

PS9, Line 34: .
maybe add "(doesn't prevent compactions from deleting ancient data)"


PS9, Line 37: DEFINE_int32(undo_delta_block_gc_init_budget_millis, 100,
this should likely be tagged as advanced, maybe even hidden.


PS9, Line 43: DEFINE_double(undo_delta_block_gc_delete_max_selection_ratio, 0.5,
            :     "The maximum fraction of the time that undo delta block gc 
may be selected "
            :     "to run by the maintenance manager. This is measured by the 
ratio of number "
            :     "of times UpdateStats() is called over the number of times 
Perform() is "
            :     "called on the UndoDeltaBlockGCOp. This is intended to avoid 
that task "
            :     "starving other maintenance tasks.");
tags (hidden, likely). this is hard to understand as a prioritization cap.


http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/tserver/tserver_mm_ops.h
File src/kudu/tserver/tserver_mm_ops.h:

PS9, Line 1: // Licensed to the Apache Software Foundation (ASF) under one
Not sure I got the complete picture of all the discussions and the controversy 
around the server level mm. I think it's a great idea to have one but I think 
that MaintenanceMananger::FindBestOp() is way too tablet specific for the 
current implementation to make any sense in this case.
In specific I think it's forcing you to expose flags that we don't really want 
if we had a server level implementation of that maintenance manager method.


PS9, Line 55: OVERRIDE
use c++11 overrides


PS9, Line 65: LogPrefix
docs


http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/util/maintenance_manager.cc
File src/kudu/util/maintenance_manager.cc:

PS9, Line 69: The target minimum size (in megabytes), per tablet, of ancient "
            :              "data on disk needed to prioritize deletion of that 
ancient data.
this sentence is very hard to parse. also why do we need a min? if we have a 
single gc'able undo delta shouldn't we always gc it, even if at a lower 
priority? in other words shouldn't the scheduler prioritize gc from something 
like 0 (no data to gc) to 1 (a lot of data to gc/disk is full) in a continuous 
spectrum?


http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/util/maintenance_manager.h
File src/kudu/util/maintenance_manager.h:

PS9, Line 118: Important: Update Clear() when adding fields to this class.
nice add, but could you move it to the class header?


PS9, Line 135: The approximate amount of disk space that not doing this 
operation keeps us from GCing from
             :   // the data blocks.
I couldn't parse this, I know its a parallel comment to the fields above, but 
those are pretty unparseable too. How about: "Approximate amount of data that 
would be GCd if this operation ran"


-- 
To view, visit http://gerrit.cloudera.org:8080/4363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0309bf7acfb6d018860c80f354012c3500da5c68
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

Reply via email to