Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/12113 )
Change subject: IMPALA-6533: Add min-max filter for decimal types on kudu tables. ...................................................................... Patch Set 13: (15 comments) http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/coordinator.cc@969 PS13, Line 969: // Decimal min-max filter needs the precision information to do Seems like sort of a weird place for this comment to me. Maybe move it to above the definition of Or() in class MinMaxFilter in min-max-filter.h, eg. something like: 'columnType' is needed for DecimalMinMaxFilter http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/decimal-value.h File be/src/runtime/decimal-value.h: http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/decimal-value.h@62 PS13, Line 62: // tValue coming through thrift contains the value_ : // that can be used to construct decimal value I don't understand this comment, eg. what are tValue (I assume this is a capitalization typo) or value_? Also note we put single quotes around variable names and wrap at 90 chars. How about something like: // Returns a new DecimalValue created from the value in 'tvalue'. http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/runtime/decimal-value.inline.h@63 PS13, Line 63: // tValue coming through thrift contains the value_ Comment can be removed, covered in the header. http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-ir.cc File be/src/util/min-max-filter-ir.cc: http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-ir.cc@93 PS13, Line 93: // Decimal values can be 4 bytes, 8 bytes or 16 bytes. Depending on the size This comment can be removed. This info is covered by the comments in min-max-filter.h http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc File be/src/util/min-max-filter-test.cc: http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc@376 PS13, Line 376: void CheckDecimalVals( : MinMaxFilter* filter, const Decimal4Value& min, const Decimal4Value& max) { Why not put all of this, including the EXPECT_FALSEs below, into the macro? http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc@404 PS13, Line 404: timestamp decimal? http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc@406 PS13, Line 406: MinMaxFilter* empty_filter = : MinMaxFilter::Create(*tFilter, column_type, obj_pool, mem_tracker); : EXPECT_TRUE(empty_filter->AlwaysFalse()); : EXPECT_FALSE(empty_filter->AlwaysTrue()); : empty_filter->Close(); Is this testing anything that isn't covered by the equivalent checks on 'filter' above? http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter-test.cc@413 PS13, Line 413: #define DECIMAL_INSERT_AND_CHECK(SIZE, PRECISION, SCALE, VALUE1, VALUE2, VALUE3) \ Could you add a brief comment explaining a bit about how this works, eg. VALUE1, VALUE2, and VALUE3 are expected to have a specific ordering. http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.h@264 PS13, Line 264: size nit: the size http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.h@265 PS13, Line 265: nit: extra space http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.h@266 PS13, Line 266: (just as it is done for other min-max filters) unnecessary to mention http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.cc File be/src/util/min-max-filter.cc: http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.cc@399 PS13, Line 399: construct nit: Construct http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.cc@451 PS13, Line 451: thrift->__isset.min = true; : thrift->__isset.max = true; This should be moved up into the 'if (!always_false_)' block above http://gerrit.cloudera.org:8080/#/c/12113/13/be/src/util/min-max-filter.cc@519 PS13, Line 519: if (in.always_false) return; The other definitions of ::Copy for the other filter types don't have this path, which makes me concerned there's a bug. Could you figure out if this needs to be removed, or if (I assume) this path needs to be added to the other Copy()s? http://gerrit.cloudera.org:8080/#/c/12113/13/tests/query_test/test_runtime_filters.py File tests/query_test/test_runtime_filters.py: http://gerrit.cloudera.org:8080/#/c/12113/13/tests/query_test/test_runtime_filters.py@116 PS13, Line 116: self.run_test_case('QueryTest/decimal_min_max_filters', vector) Given this .test file is massive, how long does this take to run? If its more than about 5s we may want to restrict this to running in exhaustive. Also, what's the distinction between the tests that are run here and the tests that you added to min_max_filters.test? -- To view, visit http://gerrit.cloudera.org:8080/12113 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib7e7278e902160d7060f8097290bc172d9031f94 Gerrit-Change-Number: 12113 Gerrit-PatchSet: 13 Gerrit-Owner: Janaki Lahorani <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Janaki Lahorani <[email protected]> Gerrit-Reviewer: Thomas Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Fri, 04 Jan 2019 23:56:37 +0000 Gerrit-HasComments: Yes
