Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15145 )

Change subject: KUDU-1625: background op to GC ancient, fully deleted rowsets
......................................................................


Patch Set 5:

(31 comments)

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/delta_tracker.cc@468
PS4, Line 468: }
> Curious why you're doing this outside the lock. Seems relatively cheap?
Mostly just good hygiene, but I'll update it if you think it's going overboard.


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/diskrowset.cc@884
PS4, Line 884:   // NOTE: this estimate might not read from disk and may thus 
return false
> Probably worth commenting here precisely why the semantics of EstimateAllRe
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mock-rowsets.h
File src/kudu/tablet/mock-rowsets.h:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mock-rowsets.h@96
PS4, Line 96:   virtual uint64_t OnDiskBaseDataColumnSize(const ColumnId& 
/*col_id*/) const override {
> warning: parameter 'col_id' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mock-rowsets.h@106
PS4, Line 106:     return nullptr;
> warning: use nullptr [modernize-use-nullptr]
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mock-rowsets.h@116
PS4, Line 116:     return nullptr;
> warning: use nullptr [modernize-use-nullptr]
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mock-rowsets.h@209
PS4, Line 209:   virtual uint64_t OnDiskBaseDataColumnSize(const ColumnId& 
/*col_id*/) const override {
> warning: parameter 'col_id' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mock-rowsets.h@233
PS4, Line 233:   virtual Status GetBounds(std::string* /*min_encoded_key*/,
> warning: parameter 'min_encoded_key' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mock-rowsets.h@234
PS4, Line 234:                            std::string* /*max_encoded_key*/) 
const override {
> warning: parameter 'max_encoded_key' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mt-tablet-test.cc
File src/kudu/tablet/mt-tablet-test.cc:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mt-tablet-test.cc@315
PS4, Line 315:   void FlushDeltasThread(int /*tid*/) {
> warning: parameter 'tid' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mt-tablet-test.cc@342
PS4, Line 342:   void CompactThread(int /*tid*/) {
> warning: parameter 'tid' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mt-tablet-test.cc@374
PS4, Line 374:     do {
> Might want to structure this loop as the other threads do, because otherwis
I just made this a do-while instead.


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mt-tablet-test.cc@519
PS4, Line 519:   FLAGS_tablet_delta_store_minor_compact_max = 10;
> Was this a mistake?
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/mt-tablet-test.cc@562
PS4, Line 562:
> warning: floating point literal has suffix 'f', which is not uppercase [rea
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/rowset.h@222
PS4, Line 222: no live
> no live rows?
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet-test-base.h
File src/kudu/tablet/tablet-test-base.h:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet-test-base.h@342
PS4, Line 342:   // Deletes 'count' rows, starting with 'first_row'.
> Short doc?
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.h@342
PS4, Line 342:   // Returns the number of bytes potentially used by rowsets 
that have no live
> Nit: seems like a rather abrupt intro to the function. Maybe start with a b
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.h@347
PS4, Line 347:   // the latest update is not considered ancient. If there is no 
DMS, looks at
> warning: function 'kudu::tablet::Tablet::GetBytesInAncientDeletedRowsets' h
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.h@350
PS4, Line 350:
             :   Status GetBytesInAncientDelete
> Not clear what this means.
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.cc@2319
PS4, Line 2319: Status Tablet::GetBytesInAncientDeletedRowsets(int64_t* 
bytes_in_ancient_deleted_rowsets) {
> Should we at least set the number of bytes to 0 on this return path?
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.cc@2328
PS4, Line 2328:   scoped_refptr<TabletComponents> comps;
              :   GetComponents(&comps);
> Why not a simple for loop on rowsets?
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.cc@2345
PS4, Line 2345:   *bytes_in_ancient_deleted_rowsets = bytes;
> No VLOG in this case?
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.cc@2350
PS4, Line 2350:
> Nit: ambiguity here, is 'them' referring to the rowsets, or the locks? I kn
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.cc@2364
PS4, Line 2364:   RowSetVector to_delete;
> This involves CountLiveRows, which can fail on older tablets. But IIRC, the
Good call. It seems like Tablet::CountLiveRows() is conditioned by 
metadata_->supports_live_row_count(), and thus the DRS calls are all gated on 
that. I'll not register this op if the live row count isn't supported by the 
tablet. IIRC, live row count support can't change.


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.cc@2368
PS4, Line 2368:   {
> Should we CHECK that we got the lock? That's guaranteed, right?
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet.cc@2381
PS4, Line 2381:         // compactions don't try to select it for compactions.
> Have you had a chance to try this on a real cluster? I'm curious what you'v
Yeah, mostly under 100ms for a few million rows deleted at a time.


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_history_gc-test.cc
File src/kudu/tablet/tablet_history_gc-test.cc:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_history_gc-test.cc@125
PS4, Line 125:   static TestRowVerifier GenRowsEqualVerifier(int32_t 
expected_val) {
> warning: method 'GenRowsEqualVerifier' can be made static [readability-conv
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_history_gc-test.cc@126
PS4, Line 126:     return [=](int32_t /*key*/, int32_t val) -> bool { return 
val == expected_val; };
> warning: parameter 'key' is unused [misc-unused-parameters]
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_history_gc-test.cc@706
PS4, Line 706:   NO_FATALS(InsertOriginalRows(kNumRowsets, rows_per_rowset_));
> Would be clearer if you used DeleteOriginalRows for this too, and parameter
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_metrics.cc
File src/kudu/tablet/tablet_metrics.cc:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_metrics.cc@259
PS4, Line 259:   "Deleted Rowset GC Running",
> Here and elsewhere you're still referring to these things as "empty" instea
Done


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_mm_ops.cc
File src/kudu/tablet/tablet_mm_ops.cc:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_mm_ops.cc@86
PS4, Line 86:
> The other tags I understand, but why this one?
Fair. Copied the others, but it's pretty reasonable to switch this off and 
consider it to be safe.


http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_replica-test.cc
File src/kudu/tablet/tablet_replica-test.cc:

http://gerrit.cloudera.org:8080/#/c/15145/4/src/kudu/tablet/tablet_replica-test.cc@815
PS4, Line 815:   ASSERT_OK(ExecuteDeletesAndRollLogs(kNumRows));
> Test here that live_row_count is 0?
Done



--
To view, visit http://gerrit.cloudera.org:8080/15145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I696e2a29ea52ad4e54801b495c322bc371787124
Gerrit-Change-Number: 15145
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Volodymyr Verovkin <[email protected]>
Gerrit-Comment-Date: Thu, 19 Mar 2020 22:42:53 +0000
Gerrit-HasComments: Yes

Reply via email to