David Ribeiro Alves has posted comments on this change. Change subject: KUDU-1601. Delete ancient UNDO delta blocks in the background ......................................................................
Patch Set 12: (23 comments) http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/tablet/delta_tracker.cc File src/kudu/tablet/delta_tracker.cc: PS9, Line 434: update.RemoveUndoDeltaBlocks(block_ids_to_remove); > The blocks don't actually get deleted until we flush the metadata, since th Ah, didn't know that. Think that having that info somewhere would be great. Maybe a header comment on the delta tracker class? http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/delta_tracker.cc File src/kudu/tablet/delta_tracker.cc: PS12, Line 287: compactions compactions and GC now, right? PS12, Line 300: No consistency problems will be visible not totally your fault, but I don't really understand why this is the case. how about: "It's ok If we don't successfully flush here (why?)" PS12, Line 300: // No consistency problems will be visible if we don't successfully : // Flush(), so no need to CHECK_OK here. think we need a full explanation why this is the case, here or somewhere else. In particular because you changed a log fatal into a return not ok PS12, Line 367: if (max_deltas_to_initialize != -1 && : tmp_blocks_initialized == max_deltas_to_initialize) break; as I mentioned in the header, don't think we need both a time and count limits. if I had to choose one I'd go with time PS12, Line 396: effectively a compaction i think i understand what you're saying, but this is misleading without additional info PS12, Line 420: -1 use a constant? http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/tablet/delta_tracker.h File src/kudu/tablet/delta_tracker.h: PS9, Line 173: ed. > Why do you think that helps with comparison mistakes? If we say invalid == because kInvalidTimestamp is a big positive number whereas kInitialTimestamp is a big negative number. If we wrongfully used the timestamp in the comparison for kInvalidTimestamp we'd delete all the delta blocks, vs for kInitialTimestamp we wouldn't delete any. Not a big deal though http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/delta_tracker.h File src/kudu/tablet/delta_tracker.h: PS12, Line 181: int64_t max_deltas_to_initialize, : MonoTime deadline, do we need both a max_deltas_to_initialize and a dealine? these seem a bit redundant. http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/diskrowset-test.cc File src/kudu/tablet/diskrowset-test.cc: PS12, Line 564: // Major delta compaction is additive. ? PS12, Line 567: We should be left with no delta stores. nit: There shouldn't be any delta stores left. PS12, Line 568: constexpr curious, why constexpr? any way think it would make sense to have this constant live in the delta tracker or close by http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/diskrowset.cc File src/kudu/tablet/diskrowset.cc: PS12, Line 545: since we've already updated the metadata nit: surround with commas PS12, Line 556: We don't CHECK_OK on Flush here because if we don't successfully flush we : // don't have consistency problems in the case of major delta compaction as I mentioned elsewhere I think we need a good explanation as to why. I assume you thought about it and know it, so it would be great to not lose that. http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/mt-tablet-test.cc File src/kudu/tablet/mt-tablet-test.cc: PS12, Line 331: /* : if (bytes_in_ancient_undos > 0) { : LOG(INFO) << "Found " << bytes_in_ancient_undos << " bytes of ancient delta undos"; : } : */ don't think you meant to leave this. http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/rowset.h File src/kudu/tablet/rowset.h: PS12, Line 146: // Initialize up to 'max_deltas_to_initialize' deltas that while 'deadline' has : // not yet been passed and while a delta file with a max timestamp later than : // 'ancient_history_mark' has not yet been encountered. Return the number of : // deltas initialized during this invocation in 'num_deltas_initialized' and : // the total amount of on-disk data known to be entirely composed of ancient : // undo delta blocks in 'bytes_in_ancient_undos'. is this comment different from the delta_tracker one? if not maybe make that one point here? PS12, Line 159: // timestamp lower than the given ancient history mark. This method only : // checks the oldest 'max_deltas_to_delete' UNDO delta files, and only if : // they are already initialized. : // : // Does not flush updates to the rowset metadata. The caller must flush the : // metadata explicitly. same http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/rowset_metadata.cc File src/kudu/tablet/rowset_metadata.cc: PS12, Line 193: auto iter = undo_delta_blocks_.begin(); : while (iter != undo_delta_blocks_.end()) { : if (ContainsKey(undos_to_remove, *iter)) { : removed.push_back(*iter); : iter = undo_delta_blocks_.erase(iter); : } else { : ++iter; : } : } all the blocks we are removing should be present, right? can we check for that? http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet.cc File src/kudu/tablet/tablet.cc: PS12, Line 285: // First, enter the shutdown state to no additional mm jobs get started. : { : std::lock_guard<rw_spinlock> lock(component_lock_); : state_ = kShutdown; : } : : // Wait until all external maintenance manager operations complete. : LOG_WITH_PREFIX(INFO) << "Waiting until all external maintenance ops complete"; : { : std::unique_lock<std::mutex> l(external_mm_op_lock_); \ : external_mm_ops_finished_.wait(l, [this] { return num_external_mm_ops_running_ == 0; }); : } : LOG_WITH_PREFIX(INFO) << "External maintenance ops are done"; this is a bit scary, if not for correctness at least for shutdown ordering issues PS12, Line 1682: Major compacting all delta stores append: ", for tests." PS12, Line 1687: if (!rs->IsAvailableForCompaction()) continue; I forget, when would this happen? In any case maybe log it (the test might be expecting that all are compacted and this way it has hint why that's not the case) http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet.h File src/kudu/tablet/tablet.h: PS12, Line 286: Find and delete all undo delta files that have a maximum op : // timestamp prior to the current ancient history mark. xs nit: weird wrapping relative to the paragraph below PS12, Line 534: // Lock to protect 'num_external_mm_ops_running_' and the associated : // condition variable 'external_mm_ops_finished_'. No other Tablet locks : // should be acquired while holding this lock; This lock should be acquired : // last. : mutable std::mutex external_mm_op_lock_; : : // Indicates how many externally-driven mm ops are running. Blocks shutdown : // when greater than zero. : int32_t num_external_mm_ops_running_; : mutable std::condition_variable external_mm_ops_finished_; think with the simplifications we discussed yesterday (IO on Perform() low probability to run on start) we could get away with having a normal tablet op and avoiding the additional synch infra -- To view, visit http://gerrit.cloudera.org:8080/4363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0309bf7acfb6d018860c80f354012c3500da5c68 Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
