David Ribeiro Alves has posted comments on this change. Change subject: KUDU-1601. Delete ancient UNDO delta blocks in the background ......................................................................
Patch Set 13: (11 comments) http://gerrit.cloudera.org:8080/#/c/4363/13/src/kudu/tablet/tablet.cc File src/kudu/tablet/tablet.cc: PS13, Line 136: enable_undo_delta_block_gc ah, I missed this flag lived here (I'm reviewing in reverse file order) yeah maybe it makes more sense to have a separate flag just for this. However if it's only pertaining to the background task (which i still find a bit weird) why not move it there? (i.e. the flag turns all undo gc ops to non-runnable) PS13, Line 1764: InitAncientUndoDeltas why are you returning bytes_in_ancient_undos? the op seems to ignore this. PS13, Line 1774: // No need to hold the selection lock here, since we are not actually making : // any changes. nit: this is slightly confusing cuz we do get a read lock to copy them, I think you could remove the comment altogether (the method above doesn't have it, for once) PS13, Line 1782: vector<pair<size_t, int64_t>> rowset_ancient_undos_est_sizes; // index, bytes : rowset_ancient_undos_est_sizes.reserve(rowsets.size()); : for (size_t i = 0; i < rowsets.size(); i++) { : const auto& rowset = rowsets[i]; : int64_t bytes; : RETURN_NOT_OK(rowset->EstimateBytesInPotentiallyAncientUndoDeltas(ancient_history_mark, : &bytes)); : rowset_ancient_undos_est_sizes.emplace_back(i, bytes); : } are adding already intialized undo deltas part of this? why? eventually this method shouldn't do anything, right? PS13, Line 1793: td::sort(rowset_ancient_undos_est_sizes.begin(), rowset_ancient_undos_est_sizes.end(), : [&](pair<size_t, int64_t>& a, pair<size_t, int64_t>& b) { : return a.second > b.second; // Descending order. : }); if we only had the uninitialized undos in rowset_ancient_undos_est_sizes would this still make sense? (just trying to simplify really) PS13, Line 1834: std::lock_guard<std::mutex> compact_lock(compact_select_lock_); we hold this lock for the whole time? all other uses of this lock in tablet.cc are just to read/get a snapshot of rowsets. we might do IO on delete right? don't we need to follow a strategy like in Tablet::PickRowSetsToCompact() http://gerrit.cloudera.org:8080/#/c/4363/13/src/kudu/tablet/tablet_mm_ops.cc File src/kudu/tablet/tablet_mm_ops.cc: PS13, Line 276: stats->set_runnable(true); set runnable to false if there no data to gc PS13, Line 283: WARN_NOT_OK(tablet_->InitAncientUndoDeltas(time_budget, &tablet_bytes_in_ancient_undos), : Substitute("$0Unable to initialize old undo delta files on tablet $1", : LogPrefix(), tablet_->tablet_id())); is tablet_bytes_in_ancient_undos defined if the method returns not ok? seems like we should RETURN_NOT_OK here and skip deleting anything http://gerrit.cloudera.org:8080/#/c/4363/13/src/kudu/util/maintenance_manager.cc File src/kudu/util/maintenance_manager.cc: PS13, Line 69: DEFINE_int64(data_gc_min_size_mb, 0, : "The (exclusive) minimum number of megabytes of ancient data on " : "disk, per tablet, needed to prioritize deletion of that data."); do we actually need this knob? when would we turn this up or down? if you want to keep it it does seem like it should be higher. (i.e. we don't want to skip a compaction to recover a couple of bytes) Finally maybe use a special value to mean "never gc" like -1 or something so that we can turn off gc if needed. PS13, Line 324: op->io_usage() == MaintenanceOp::LOW_IO_USAGE remind me what this is doing? PS13, Line 395: rand_.NextDoubleFraction() > FLAGS_data_gc_prioritization_prob nit: can you revert this (<= not >) and flip the if? -- To view, visit http://gerrit.cloudera.org:8080/4363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0309bf7acfb6d018860c80f354012c3500da5c68 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
