Mike Percy has posted comments on this change.

Change subject: KUDU-1601. Delete ancient UNDO delta blocks in the background
......................................................................


Patch Set 12:

(36 comments)

http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

PS9, Line 434: update.RemoveUndoDeltaBlocks(block_ids_to_remove);
> Ah, didn't know that. Think that having that info somewhere would be great.
Done, added a comment to DeleteAncientUndoDeltas()


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

PS12, Line 287: compactions
> compactions and GC now, right?
That's true. Fixed.


PS12, Line 300: No consistency problems will be visible
> not totally your fault, but I don't really understand why this is the case.
Because it's specified to only be used for compactions, not flushes. And that 
is true.


PS12, Line 300: // No consistency problems will be visible if we don't 
successfully
              :     // Flush(), so no need to CHECK_OK here.
> think we need a full explanation why this is the case, here or somewhere el
I added an extra sentence at the end. LMK if you're still not comfortable with 
it.


PS12, Line 367: if (max_deltas_to_initialize != -1 &&
              :           tmp_blocks_initialized == max_deltas_to_initialize) 
break;
> as I mentioned in the header, don't think we need both a time and count lim
Done


PS12, Line 396: effectively a compaction
> i think i understand what you're saying, but this is misleading without add
Updated the comment


PS12, Line 420: -1
> use a constant?
Removed this param


http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

PS9, Line 173: ed.
> because kInvalidTimestamp is a big positive number whereas kInitialTimestam
I look at it in the opposite way, hit me up on Slack if you want to discuss 
this point. A high timestamp is way in the future whereas a low one is way in 
the past.


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

PS12, Line 181: int64_t max_deltas_to_initialize,
              :                         MonoTime deadline,
> do we need both a max_deltas_to_initialize and a dealine? these seem a bit 
You are right... at some point I wanted more control but I don't think it's 
used now.


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/diskrowset-test.cc
File src/kudu/tablet/diskrowset-test.cc:

PS12, Line 564: // Major delta compaction is additive.
> ?
Tried to be terse but I guess it's cryptic. I reworded this.


PS12, Line 567: We should be left with no delta stores.
> nit: There shouldn't be any delta stores left.
Done


PS12, Line 568: constexpr
> curious, why constexpr? any way think it would make sense to have this cons
I am going to remove this parameter also, since it's never used.


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

PS12, Line 545: since we've already updated the metadata
> nit: surround with commas
Done


PS12, Line 556: We don't CHECK_OK on Flush here because if we don't 
successfully flush we
              :   // don't have consistency problems in the case of major delta 
compaction
> as I mentioned elsewhere I think we need a good explanation as to why. I as
I added an explanation right below this line: "we are not adding additional 
mutations that weren't already present"


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/mt-tablet-test.cc
File src/kudu/tablet/mt-tablet-test.cc:

PS12, Line 331:       /*
              :       if (bytes_in_ancient_undos > 0) {
              :         LOG(INFO) << "Found " << bytes_in_ancient_undos << " 
bytes of ancient delta undos";
              :       }
              :       */
> don't think you meant to leave this.
Done


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

PS12, Line 146:   // Initialize up to 'max_deltas_to_initialize' deltas that 
while 'deadline' has
              :   // not yet been passed and while a delta file with a max 
timestamp later than
              :   // 'ancient_history_mark' has not yet been encountered. 
Return the number of
              :   // deltas initialized during this invocation in 
'num_deltas_initialized' and
              :   // the total amount of on-disk data known to be entirely 
composed of ancient
              :   // undo delta blocks in 'bytes_in_ancient_undos'.
> is this comment different from the delta_tracker one? if not maybe make tha
Good idea, done.


PS12, Line 159:   // timestamp lower than the given ancient history mark. This 
method only
              :   // checks the oldest 'max_deltas_to_delete' UNDO delta files, 
and only if
              :   // they are already initialized.
              :   //
              :   // Does not flush updates to the rowset metadata. The caller 
must flush the
              :   // metadata explicitly.
> same
Done


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/rowset_metadata.cc
File src/kudu/tablet/rowset_metadata.cc:

PS12, Line 193:     auto iter = undo_delta_blocks_.begin();
              :     while (iter != undo_delta_blocks_.end()) {
              :       if (ContainsKey(undos_to_remove, *iter)) {
              :         removed.push_back(*iter);
              :         iter = undo_delta_blocks_.erase(iter);
              :       } else {
              :         ++iter;
              :       }
              :     }
> all the blocks we are removing should be present, right? can we check for t
Done


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

PS12, Line 285:   // First, enter the shutdown state to no additional mm jobs 
get started.
              :   {
              :     std::lock_guard<rw_spinlock> lock(component_lock_);
              :     state_ = kShutdown;
              :   }
              : 
              :   // Wait until all external maintenance manager operations 
complete.
              :   LOG_WITH_PREFIX(INFO) << "Waiting until all external 
maintenance ops complete";
              :   {
              :     std::unique_lock<std::mutex> l(external_mm_op_lock_); \
              :     external_mm_ops_finished_.wait(l, [this] { return 
num_external_mm_ops_running_ == 0; });
              :   }
              :   LOG_WITH_PREFIX(INFO) << "External maintenance ops are done";
> this is a bit scary, if not for correctness at least for shutdown ordering 
Yeah, it worked but was not pretty. I was able to remove it.


PS12, Line 1682: Major compacting all delta stores
> append: ", for tests."
Done


PS12, Line 1687: if (!rs->IsAvailableForCompaction()) continue;
> I forget, when would this happen? In any case maybe log it (the test might 
The MRS is never available for compaction so that is the purpose of this. The 
other case is that someone is holding the compact_flush_lock for the RowSet, 
which means they are compacting.

If we log it might be a bit spammy because of the MRS thing.


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

PS12, Line 286: Find and delete all undo delta files that have a maximum op
              :   // timestamp prior to the current ancient history mark.
> xs nit: weird wrapping relative to the paragraph below
Done


PS12, Line 534:   // Lock to protect 'num_external_mm_ops_running_' and the 
associated
              :   // condition variable 'external_mm_ops_finished_'. No other 
Tablet locks
              :   // should be acquired while holding this lock; This lock 
should be acquired
              :   // last.
              :   mutable std::mutex external_mm_op_lock_;
              : 
              :   // Indicates how many externally-driven mm ops are running. 
Blocks shutdown
              :   // when greater than zero.
              :   int32_t num_external_mm_ops_running_;
              :   mutable std::condition_variable external_mm_ops_finished_;
> think with the simplifications we discussed yesterday (IO on Perform() low 
Yep, done


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet_history_gc-test.cc
File src/kudu/tablet/tablet_history_gc-test.cc:

PS12, Line 550: // No deletes.
> why is this relevant?
Hmm it's not, removed


PS12, Line 559: MonoTime deadline = MonoTime::Now() + 
MonoDelta::FromSeconds(60);
> use MonoTime::Max() here
Changed the method signature to a timeout, changed this to an initialized 
MonoDelta.


PS12, Line 560: bytes_in_ancient_undos
> assert that this is GT 0
Done


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet_metrics.cc
File src/kudu/tablet/tablet_metrics.cc:

PS12, Line 119: Does not include bytes garbage collected during compactions.
> why? think it would make sense to have both, no?
I think it would be nice to have a separate metric for that, but I don't think 
it needs to be part of this patch.


PS12, Line 126: Does not include blocks garbage collected during compactions.
> not sure it's worth it to have both metrics. what does the number of blocks
Yeah that's fair, I don't think people will really care about this metric. 
Removing it.


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

PS12, Line 1131: WARN_NOT_OK(server_->tablet_manager()->GetTabletPeers(&peers), 
"Cannot list tablets"
> return on not OK?
Reverted this change to the method signature


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS12, Line 842: WARN_NOT_OK(GetTabletPeers(&peers_to_shutdown), "Cannot list 
tablets");
> should this be a warn? what happens if we don't get all of them? from what 
Reverted this


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/ts_tablet_manager.h
File src/kudu/tserver/ts_tablet_manager.h:

PS12, Line 190: fs_manager
> docs
Reverted this


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/tserver-path-handlers.cc
File src/kudu/tserver/tserver-path-handlers.cc:

PS12, Line 114: WARN_NOT_OK(tserver_->tablet_manager()->GetTabletPeers(&peers), 
"Cannot list tablets");
> what happens below on not-OK?
reverted


PS12, Line 182: WARN_NOT_OK(tserver_->tablet_manager()->GetTabletPeers(&peers), 
"Cannot list tablets");
> same
reverted


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/tserver_mm_ops.cc
File src/kudu/tserver/tserver_mm_ops.cc:

PS12, Line 59: Total Undo Delta Block GC Init Duration
> think the explanation text below is better than the metric name and not muc
I ended up removing these "total" metrics since there wasn't an analog for them 
elsewhere.


PS12, Line 108: enable_undo_delta_block_gc
> same comment as before regarding merging the enable_ flag with the undo max
Since it's a new feature, it seems safer to have a flag to turn off undo delta 
block gc independently of changing the ancient history mark.

Or do you mean --data_gc_min_size_mb ?


http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/tserver_mm_ops.h
File src/kudu/tserver/tserver_mm_ops.h:

PS12, Line 50: class UndoDeltaBlockGCOp : public MaintenanceOp
> likely can make this a regular tablet op with the changes we discussed.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0309bf7acfb6d018860c80f354012c3500da5c68
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

Reply via email to