helifu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11381 )

Change subject: KUDU-2566: Enhance rowset tree pruning and stop copying strings
......................................................................


Patch Set 9:

(47 comments)

thanks to Adar and Will.

http://gerrit.cloudera.org:8080/#/c/11381/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11381/2//COMMIT_MSG@7
PS2, Line 7: Enhance rowset tr
> Could you make the subject more descriptive? Howabout "[tablet] KUDU-2566:
Done


http://gerrit.cloudera.org:8080/#/c/11381/2//COMMIT_MSG@10
PS2, Line 10: Support open-ended intervals
> Following good commit message guidelines, could you explain what the previo
Done


http://gerrit.cloudera.org:8080/#/c/11381/2//COMMIT_MSG@11
PS2, Line 11: Previously, the rowset tree could only compute overlap with a
> Again, let's fill out the explanation here a bit. My suggestion (adding on
Done


http://gerrit.cloudera.org:8080/#/c/11381/5//COMMIT_MSG
Commit Message:

PS5:
> Thanks for rewriting the commit message. It's really nice now.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5//COMMIT_MSG@7
PS5, Line 7: stop copying strings
> nit: "stop copying strings"
Done


http://gerrit.cloudera.org:8080/#/c/11381/5//COMMIT_MSG@11
PS5, Line 11:
> nit: Could you wrap commit message lines at 72 characters?
Done


http://gerrit.cloudera.org:8080/#/c/11381/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11381/6//COMMIT_MSG@8
PS6, Line 8:
> I think git only uses the first line as the subject so inserting a line bre
Done


http://gerrit.cloudera.org:8080/#/c/11381/4/src/kudu/tablet/rowset_tree-test.cc
File src/kudu/tablet/rowset_tree-test.cc:

http://gerrit.cloudera.org:8080/#/c/11381/4/src/kudu/tablet/rowset_tree-test.cc@18
PS4, Line 18: #include <algorithm>
> warning: #includes are not sorted properly [llvm-include-order]
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree-test.cc
File src/kudu/tablet/rowset_tree-test.cc:

http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree-test.cc@211
PS5, Line 211: domized
> By the convention of the Google style guide, we use references arguments on
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree-test.cc@218
PS5, Line 218:
> I think we want coverage of the degenerate case when s1 == s2 (so the inter
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree-test.cc@217
PS5, Line 217: UND_EQUAL
             :   };
             :   const auto& GetStringPair = [] (const BoundOperator op) -> 
std::pair<string, string> {
             :     while (true) {
             :       string
> You can simplify this a little bit to
Done


http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/tablet/rowset_tree-test.cc
File src/kudu/tablet/rowset_tree-test.cc:

http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/tablet/rowset_tree-test.cc@251
PS6, Line 251:   vector<RowSet*> out;
> Don't need to SeedRandom() twice in a test.
Done


http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/tablet/rowset_tree-test.cc@283
PS6, Line 283:         testing::Values(10, 100, 250, 500),
> Should either fix this test case, or remove it.
Done


http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/tablet/rowset_tree.h
File src/kudu/tablet/rowset_tree.h:

http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/tablet/rowset_tree.h@96
PS6, Line 96:   //  [lower_bound, upper_bound)
> Nit: got a tab here.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.h
File src/kudu/tablet/rowset_tree.h:

http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.h@90
PS5, Line 90: en 'lower_bound' is boost::none
> Can you add a comment explaining what it means to pass boost::none for 'low
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.h@90
PS5, Line 90:
> style nit: We put the &'s and *'s for refs and pointers by the type, not th
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.h@91
PS5, Line 91:
> Likewise.
Done


http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/tablet/rowset_tree.h
File src/kudu/tablet/rowset_tree.h:

http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/tablet/rowset_tree.h@88
PS2, Line 88:                                    const 
std::function<void(RowSet*, int)>& cb) const;
            :
            :   // When 'lower_bound' is boost::none, it means negative 
infinity.
            :   // When 'upper_bound' is boost::none, it means positive 
infinity.
            :   // So the query interval can be one of below:
            :   //  [-OO, +OO)
            :   //  [-OO, upper_bound)
> +1. We could also accept boost::none for both arguments and short-circuit r
some tests, example TestTablet.TestRowIteratorSimple in tablet-test.cc, will 
rely on rowsets's sequence, so we need keep the original logic.


http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/tablet/rowset_tree.h@88
PS2, Line 88:                                    const 
std::function<void(RowSet*, int)>& cb) const;
            :
            :   // When 'lower_bound' is boost::none, it means negative 
infinity.
            :   // When 'upper_bound' is boost::none, it means positive 
infinity.
            :   // So the query interval can be one of below:
            :   //  [-OO, +OO)
            :   //  [-OO, upper_bound)
> Rather than introducing new FindRowSetsIntersectingInterval... methods, cou
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc
File src/kudu/tablet/rowset_tree.cc:

http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc@107
PS5, Line 107: &
> Likewise with the & placement nit.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc@108
PS5, Line 108: &
> Likewise with the & placement nit.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc@109
PS5, Line 109: const
> I don't think the const specifier is meaningful on a bool passed by value.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc@117
PS5, Line 117: &
> Likewise with the & placement nit.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc@118
PS5, Line 118: &
> Likewise with the & placement nit.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc@119
PS5, Line 119: const
> Ditto on the const specifier.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc@194
PS5, Line 194: &
> Likewise with the & placement nit.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc@195
PS5, Line 195: &
> Likewise with the & placement nit.
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/tablet/rowset_tree.cc@196
PS5, Line 196: *
> * next to type, not parameter name.
Done


http://gerrit.cloudera.org:8080/#/c/11381/3/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/11381/3/src/kudu/tablet/tablet.cc@1773
PS3, Line 1773:
              :   // Grab the memrowset iterator.
              :   gscoped_ptr<RowwiseIterator> ms_iter;
              :   RETURN_NOT_OK(components_->memrowset->NewRowIterator(opts, 
&ms_iter));
              :   ret.emplace_back(ms_iter.release());
              :
              :   opts.io_context = io_context;
              :
should check spec != nullptr first, because some test cases's spec is nullptr.


http://gerrit.cloudera.org:8080/#/c/11381/3/src/kudu/tablet/tablet.cc@1782
PS3, Line 1782: if (spec != nullptr && (spec->lower_bound_key() || 
spec->exclusive_upper_bound_key())) {
              :     boost::optional<Slice> lower_bound = 
spec->lower_bound_key() ? \
              :         
boost::optional<Slice>(spec->lower_bound_key()->encoded_key()) : boost::none;
              :     boost::optional<Slice> upper_bound = 
spec->exclusive_upper_bound_key() ? \
              :         
boost::optional<Slice>(spec->exclusive_upper_bound_key()->encoded_key()) : 
boost::none;
              :     vector<RowSet*> interval_sets;
              :     
components_->rowsets->FindRowSetsIntersectingInterval(lower_bound, upper_bound, 
&interval_sets);
              :
some tests, example TestTablet.TestRowIteratorSimple in tablet-test.cc, will 
rely on rowsets's sequence, so we need keep the original logic.


http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/tablet/tablet.cc@1774
PS6, Line 1774:   // Grab the memrowset iterator.
              :   gscoped_ptr<RowwiseIterator> ms_iter;
              :   RETURN_NOT_OK(components_->memrowset->NewRowIterator(opts, 
&ms_iter));
              :   ret.emplace_back(ms_iter.release());
              :
              :   opts.io_context = io_context;
              :
              :   //
> Nit: combine using ternaries:
Done


http://gerrit.cloudera.org:8080/#/c/11381/7/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/11381/7/src/kudu/tablet/tablet.cc@1793
PS7, Line 1793:                                      rs->ToString()));
              :       ret.emplace_back(row_it.release());
> Can remove this.
Done


http://gerrit.cloudera.org:8080/#/c/11381/7/src/kudu/tablet/tablet.cc@2362
PS7, Line 2362:
> nit: extra line
Done


http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/util/interval_tree-inl.h
File src/kudu/util/interval_tree-inl.h:

http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/util/interval_tree-inl.h@402
PS6, Line 402: interval
> Nit: 'intervals' was correct; this deals with multiple intervals.
"interval ... is ... "  | "intervals ... are ... ", maybe the former is simple 
:)


http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/util/interval_tree-inl.h@402
PS6, Line 402: interval
> I think 'interval' can be singular or plural as long as the rest of the sen
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/util/interval_tree-inl.h
File src/kudu/util/interval_tree-inl.h:

http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/util/interval_tree-inl.h@402
PS5, Line 402: interval
> nit: interval
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/util/interval_tree-inl.h@409
PS5, Line 409: POSI
> To make this easier to understand locally, could you do one of two things:
Done


http://gerrit.cloudera.org:8080/#/c/11381/1/src/kudu/util/interval_tree-test.cc
File src/kudu/util/interval_tree-test.cc:

http://gerrit.cloudera.org:8080/#/c/11381/1/src/kudu/util/interval_tree-test.cc@66
PS1, Line 66:       //            |     |
> warning: redundant boolean literal in conditional return statement [readabi
Done


http://gerrit.cloudera.org:8080/#/c/11381/1/src/kudu/util/interval_tree-test.cc@71
PS1, Line 71:       // [-OO,    upper)
> warning: redundant boolean literal in conditional return statement [readabi
Done


http://gerrit.cloudera.org:8080/#/c/11381/1/src/kudu/util/interval_tree-test.cc@223
PS1, Line 223:     tree.FindIntersectingInterval(lower, upper, &results);
> warning: the const qualified parameter 'all_intervals' is copied for each i
Done


http://gerrit.cloudera.org:8080/#/c/11381/1/src/kudu/util/interval_tree-test.cc@238
PS1, Line 238:   }
> warning: the const qualified parameter 'all_intervals' is copied for each i
Done


http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/util/interval_tree-test.cc
File src/kudu/util/interval_tree-test.cc:

http://gerrit.cloudera.org:8080/#/c/11381/5/src/kudu/util/interval_tree-test.cc@64
PS5, Line 64: wer == boost::none &
> I'm getting confused now about which intervals are inclusive of the right e
Done


http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/util/interval_tree-test.cc
File src/kudu/util/interval_tree-test.cc:

http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/util/interval_tree-test.cc@59
PS6, Line 59: // boost::none means infinity.
            :   // [left,  right] is closed interval.
            :   // [lower, upper) is half-open interval, so the upper is 
exclusive.
> This and the pictures are great. Thanks for adding them!
Done


http://gerrit.cloudera.org:8080/#/c/11381/7/src/kudu/util/interval_tree-test.cc
File src/kudu/util/interval_tree-test.cc:

http://gerrit.cloudera.org:8080/#/c/11381/7/src/kudu/util/interval_tree-test.cc@187
PS7, Line 187: &
> nit: Here and below, & with type.
Done


http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/util/interval_tree.h
File src/kudu/util/interval_tree.h:

http://gerrit.cloudera.org:8080/#/c/11381/6/src/kudu/util/interval_tree.h@145
PS6, Line 145:   void FindIntersectingInterval(const QueryPointType& 
lower_bound,
             :                                 const QueryPointType& 
upper_bound,
> Nit: const QueryPointType&
Done


http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/util/interval_tree.h
File src/kudu/util/interval_tree.h:

http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/util/interval_tree.h@140
PS2, Line 140:
             :   // Find all intervals in the tree which intersect the given 
interval.
             :   // The resulting intervals are added to the 'results' vector.
             :   // The vector is not cleared first.
             :   template<class QueryPointType>
             :   void FindIntersectingInterval(const QueryPointType& 
lower_bound,
             :                                 const QueryPointType& 
upper_bound,
             :                                 IntervalVector* results) const;
             :  private:
             :   static void Partition(const IntervalVector &in,
             :
> Rather than introduce these new methods, would it be possible to modify int
Done


http://gerrit.cloudera.org:8080/#/c/11381/2/src/kudu/util/interval_tree.h@140
PS2, Line 140:
             :   // Find all intervals in the tree which intersect the given 
interval.
             :   // The resulting intervals are added to the 'results' vector.
             :   // The vector is not cleared first.
             :   template<class QueryPointType>
             :   void FindIntersectingInterval(const QueryPointType& 
lower_bound,
             :                                 const QueryPointType& 
upper_bound,
             :                                 IntervalVector* results) const;
             :  private:
             :   static void Partition(const IntervalVector &in,
             :
> +1 to the idea though I think it might suffice to add an additional QueryIn
Done



--
To view, visit http://gerrit.cloudera.org:8080/11381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e34b4169f93f3519f694c9bf86ca5295c318e79
Gerrit-Change-Number: 11381
Gerrit-PatchSet: 9
Gerrit-Owner: helifu <hzhel...@corp.netease.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: helifu <hzhel...@corp.netease.com>
Gerrit-Comment-Date: Wed, 31 Oct 2018 07:23:34 +0000
Gerrit-HasComments: Yes

Reply via email to