[Impala-ASF-CR] IMPALA-6533: Add min-max filter for decimal types on kudu tables.

Thomas Marshall (Code Review) Fri, 04 Jan 2019 15:56:47 -0800

Thomas Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12113 )


Change subject: IMPALA-6533: Add min-max filter for decimal types on kudu 
tables.
......................................................................


Patch Set 13:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/coordinator.cc@969
PS13, Line 969:       // Decimal min-max filter needs the precision information 
to do
Seems like sort of a weird place for this comment to me. Maybe move it to above 
the definition of Or() in class MinMaxFilter in min-max-filter.h, eg. something 
like:

'columnType' is needed for DecimalMinMaxFilter


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/decimal-value.h
File be/src/runtime/decimal-value.h:

http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/decimal-value.h@62
PS13, Line 62: // tValue coming through thrift contains the value_
             :   // that can be used to construct decimal value
I don't understand this comment, eg. what are tValue (I assume this is a 
capitalization typo) or value_? Also note we put single quotes around variable 
names and wrap at 90 chars.

How about something like:
// Returns a new DecimalValue created from the value in 'tvalue'.


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/decimal-value.inline.h@63
PS13, Line 63: // tValue coming through thrift contains the value_
Comment can be removed, covered in the header.


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-ir.cc
File be/src/util/min-max-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-ir.cc@93
PS13, Line 93: // Decimal values can be 4 bytes, 8 bytes or 16 bytes.  
Depending on the size
This comment can be removed. This info is covered by the comments in 
min-max-filter.h


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc
File be/src/util/min-max-filter-test.cc:

http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc@376
PS13, Line 376: void CheckDecimalVals(
              :     MinMaxFilter* filter, const Decimal4Value& min, const 
Decimal4Value& max) {
Why not put all of this, including the EXPECT_FALSEs below, into the macro?


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc@404
PS13, Line 404: timestamp
decimal?


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc@406
PS13, Line 406: MinMaxFilter* empty_filter =
              :       MinMaxFilter::Create(*tFilter, column_type, obj_pool, 
mem_tracker);
              :   EXPECT_TRUE(empty_filter->AlwaysFalse());
              :   EXPECT_FALSE(empty_filter->AlwaysTrue());
              :   empty_filter->Close();
Is this testing anything that isn't covered by the equivalent checks on 
'filter' above?


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc@413
PS13, Line 413: #define DECIMAL_INSERT_AND_CHECK(SIZE, PRECISION, SCALE, 
VALUE1, VALUE2, VALUE3)     \
Could you add a brief comment explaining a bit about how this works, eg. 
VALUE1, VALUE2, and VALUE3 are expected to have a specific ordering.


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.h@264
PS13, Line 264: size
nit: the size


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.h@265
PS13, Line 265:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.h@266
PS13, Line 266: (just as it is done for other min-max filters)
unnecessary to mention


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.cc
File be/src/util/min-max-filter.cc:

http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.cc@399
PS13, Line 399: construct
nit: Construct


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.cc@451
PS13, Line 451: thrift->__isset.min = true;
              :   thrift->__isset.max = true;
This should be moved up into the 'if (!always_false_)' block above


http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.cc@519
PS13, Line 519:   if (in.always_false) return;
The other definitions of ::Copy for the other filter types don't have this 
path, which makes me concerned there's a bug.

Could you figure out if this needs to be removed, or if (I assume) this path 
needs to be added to the other Copy()s?


http://gerrit.cloudera.org:8080/#/c/12113/13/tests/query_test/test_runtime_filters.py
File tests/query_test/test_runtime_filters.py:

http://gerrit.cloudera.org:8080/#/c/12113/13/tests/query_test/test_runtime_filters.py@116
PS13, Line 116:     self.run_test_case('QueryTest/decimal_min_max_filters', 
vector)
Given this .test file is massive, how long does this take to run? If its more 
than about 5s we may want to restrict this to running in exhaustive.

Also, what's the distinction between the tests that are run here and the tests 
that you added to min_max_filters.test?



--
To view, visit http://gerrit.cloudera.org:8080/12113
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7e7278e902160d7060f8097290bc172d9031f94
Gerrit-Change-Number: 12113
Gerrit-PatchSet: 13
Gerrit-Owner: Janaki Lahorani <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Janaki Lahorani <[email protected]>
Gerrit-Reviewer: Thomas Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Fri, 04 Jan 2019 23:56:37 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-6533: Add min-max filter for decimal types on kudu tables.

Reply via email to