David Ribeiro Alves has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11395 )

Change subject: KUDU-686 (part 2/2): use DeltaPreparer in DeltaFileIterator
......................................................................


Patch Set 8:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/11395/7/src/kudu/tablet/delta_key.h
File src/kudu/tablet/delta_key.h:

http://gerrit.cloudera.org:8080/#/c/11395/7/src/kudu/tablet/delta_key.h@51
PS7, Line 51: DeltaTypeSelector
docs


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/delta_store.h
File src/kudu/tablet/delta_store.h:

http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/delta_store.h@222
PS8, Line 222: DMSPreparerTraits
docs


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/delta_store.h@271
PS8, Line 271: to 'key.row_idx()' are irrelevant
... under a snapshot?


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/delta_store.h@290
PS8, Line 290: last_added_idx
hum, idx of what? docs would help


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/delta_store.h@301
PS8, Line 301:   void MaybeProcessPreviousRowChange(boost::optional<rowid_t> 
cur_row_idx);
hum this method seems weird and sketchy. when isn't there a new row index and 
why are we updating previous values.
maybe it will make more sense in the cc file


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/delta_store.cc
File src/kudu/tablet/delta_store.cc:

http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/delta_store.cc@238
PS8, Line 238:   for (const auto& row_id : deleted_) {
             :     uint32_t idx_in_block = row_id - prev_prepared_idx_;
             :     sel_vec->SetRowUnselected(idx_in_block);
             :   }
             :
             :   for (const auto& row_id : reinserted_) {
             :     uint32_t idx_in_block = row_id - prev_prepared_idx_;
             :     sel_vec->SetRowSelected(idx_in_block);
             :   }
is this still correct if the same row has both deletes and reinserts? for 
instance if the sequence of deltas is delete->reinsert->delete wouldn't this 
still mark it as selected?


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/delta_store.cc@342
PS8, Line 342:   // We can't use RowChangeListDecoder.TwiddleDeleteStatus 
because:
             :   // 1. Our deletion status includes an additional NONE state.
             :   // 2. The logical chain of DELETEs and REINSERTs for a given 
row may extend
             :   //    across DeltaPreparer instances. For example, the same 
row may be deleted
             :   //    in one delta file and reinserted in the next. But, 
because
             :   //    DeltaPreparers cannot exchange this information in the 
context of batch
             :   //    preparation, we have to allow any state transition from 
NONE.
             :   //
             :   // DELETE+REINSERT pairs are reset back to NONE: these rows 
were both deleted
             :   // and reinserted in the same batch, so their states haven't 
actually changed.
this makes sense but I think this patch needs to come with an extensive 
dist-test run of fuzz test, particularly one that has many deltas on the same 
row


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/delta_store.cc@358
PS8, Line 358: NONE
maybe UNKNOWN would be a better name?


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/deltafile.cc
File src/kudu/tablet/deltafile.cc:

http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/deltafile.cc@588
PS8, Line 588:   prepared_ = true;
nit move this to the bottom?


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/deltafile.cc@650
PS8, Line 650:       // Note: if finished_row is true, we could skip the 
remaining deltas for
             :       // this row by seeking the block decoder. This trades off 
the cost of a
             :       // seek against the cost of decoding some irrelevant delta 
keys.
             :       //
             :       // Given that updates are expected to be uncommon and that 
most scans are
             :       // _not_ historical, the current implementation eschews 
seeking in favor of
             :       // skipping irrelevant deltas one by one.
kinda weird this commet is here. move it to where finished_row is and maybe 
rename finished_row to finished_row_unused or something (I was looking for 
where it was used until I found this)


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/deltamemstore-test.cc
File src/kudu/tablet/deltamemstore-test.cc:

http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/deltamemstore-test.cc@73
PS8, Line 73: #ifdef NDEBUG
            :              100,
            : #else
            :              1,
            : #endif
this is not how we normally do this, is it?


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/deltamemstore.cc
File src/kudu/tablet/deltamemstore.cc:

http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/deltamemstore.cc@264
PS8, Line 264:     // Note: if finished_row is true, we could skip the 
remaining deltas for
             :     // this row by seeking the tree iterator. This trades off 
the cost of a seek
             :     // against the cost of decoding some irrelevant delta keys. 
Experimentation
             :     // with a microbenchmark revealed that only when ~50 deltas 
were skipped was
             :     // the seek cheaper than the decoding.
             :     //
             :     // Given that updates are expected to be uncommon and that 
most scans are
             :     // _not_ historical, the current implementation eschews 
seeking in favor of
             :     // skipping irrelevant deltas one by one.
same comment


http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/mvcc.h
File src/kudu/tablet/mvcc.h:

http://gerrit.cloudera.org:8080/#/c/11395/8/src/kudu/tablet/mvcc.h@114
PS8, Line 114:   // Returns true if 'other' represents the same time ranges as 
this snapshot,
             :   // false otherwise.
             :   bool Equals(const MvccSnapshot& other) const;
what do you mean time ranges? is other the exact same snapshot or not? e.g. 
what if other has the same all committed before and none committed after but 
different set of inflights? are they still equal? at the least this needs to be 
clear. likely HasSameTimeRangeAs is a better name?



--
To view, visit http://gerrit.cloudera.org:8080/11395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87de52092262c4b42c1bd5107f9139edfc3888b5
Gerrit-Change-Number: 11395
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Thu, 04 Oct 2018 04:24:58 +0000
Gerrit-HasComments: Yes

Reply via email to