Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12113 )
Change subject: IMPALA-6533: Add min-max filter for decimal types on kudu tables. ...................................................................... Patch Set 11: (9 comments) This is looking pretty good, I am at a +1 once these comments are addressed. I'll let Thomas give the +2 since he built out this code originally. http://gerrit.cloudera.org:8080/#/c/12113/11/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/12113/11/be/src/runtime/coordinator.cc@972 PS11, Line 972: TExprNode I don't think this cast is necessary, is it? http://gerrit.cloudera.org:8080/#/c/12113/11/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/12113/11/be/src/util/min-max-filter.h@91 PS11, Line 91: , int precision Maybe just pass in a const ColumnType&? That would be more generic. Feel free to ignore this one. http://gerrit.cloudera.org:8080/#/c/12113/11/be/src/util/min-max-filter.h@94 PS11, Line 94: static void Or(const TMinMaxFilter& in, TMinMaxFilter* out) { This overload only exists as a convenience for test functions right? It's confusing because product code doesn't call it. I think it would be better to remove this overload and update the tests to pass in a dummy argument. http://gerrit.cloudera.org:8080/#/c/12113/11/be/src/util/min-max-filter.h@247 PS11, Line 247: #define GET_MINMAX(MIN_OR_MAX) \ Let's stop this macro leaking out of the header. We use push_macro/pop_macro like below in some other places in the code. #pragma push_macro("GET_MINMAX") #define GET_MINMAX .... .... #pragma pop_macro("GET_MINMAX") http://gerrit.cloudera.org:8080/#/c/12113/11/be/src/util/min-max-filter.cc File be/src/util/min-max-filter.cc: http://gerrit.cloudera.org:8080/#/c/12113/11/be/src/util/min-max-filter.cc@492 PS11, Line 492: if (in.always_false) nit: our style is to use braces around the if body (except if the whole if statement fits on a single line). http://gerrit.cloudera.org:8080/#/c/12113/11/testdata/data/README File testdata/data/README: http://gerrit.cloudera.org:8080/#/c/12113/11/testdata/data/README@293 PS11, Line 293: with decimal types on Kudu. Can you mention how the data was generated, in case someone wants to reproduce something similar? http://gerrit.cloudera.org:8080/#/c/12113/11/testdata/datasets/functional/functional_schema_template.sql File testdata/datasets/functional/functional_schema_template.sql: http://gerrit.cloudera.org:8080/#/c/12113/11/testdata/datasets/functional/functional_schema_template.sql@1854 PS11, Line 1854: -- Small table with decimal values. This is used for I guess the handling of comments in these files is a bit broken - it's including the comment as part of the table name for some purposes, e.g. the below command wasn't working to load the table data: ./bin/load-data.py -w functional-query --table_names=decimal_rtf_tiny_tbl --table_formats=text/none,kudu/none -f I could work around it by moving the comment to the DATASET section, although I don't think that fixes the underlying problem that comments aren't stripped. So maybe move the comment to be consistent with the rest of the file, but we can probably skip messing with the file parsing code as part of this patch. http://gerrit.cloudera.org:8080/#/c/12113/11/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test File testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test: PS11: As a hack to validate the results, I modified the test code to run the same tests against text, ignoring the RUNTIME_PROFILE sections: diff --git a/tests/common/impala_test_suite.py b/tests/common/impala_test_suite.py index a4db757..765365b 100644 --- a/tests/common/impala_test_suite.py +++ b/tests/common/impala_test_suite.py @@ -512,11 +512,12 @@ class ImpalaTestSuite(BaseTestSuite): rt_profile_info = 'RUNTIME_PROFILE' if rt_profile_info is not None: - rt_profile = verify_runtime_profile(test_section[rt_profile_info], - result.runtime_profile, - update_section=pytest.config.option.update_results) - if pytest.config.option.update_results: - test_section[rt_profile_info] = "".join(rt_profile) + #rt_profile = verify_runtime_profile(test_section[rt_profile_info], + # result.runtime_profile, + # update_section=pytest.config.option.update_results) + #if pytest.config.option.update_results: + # test_section[rt_profile_info] = "".join(rt_profile) + pass if 'DML_RESULTS' in test_section: assert 'ERRORS' not in test_section diff --git a/tests/query_test/test_runtime_filters.py b/tests/query_test/test_runtime_filters.py index f769224..f331da4 100644 --- a/tests/query_test/test_runtime_filters.py +++ b/tests/query_test/test_runtime_filters.py @@ -107,7 +107,7 @@ class TestMinMaxFilters(ImpalaTestSuite): super(TestMinMaxFilters, cls).add_test_dimensions() # Min-max filters are only implemented for Kudu. cls.ImpalaTestMatrix.add_constraint( - lambda v: v.get_value('table_format').file_format in ['kudu']) + lambda v: v.get_value('table_format').file_format in ['text']) def test_min_max_filters(self, vector): self.run_test_case('QueryTest/min_max_filters', vector) http://gerrit.cloudera.org:8080/#/c/12113/11/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@2789 PS11, Line 2789: # Test case 9.1: DecimalMinMaxFilters with left semi join - 4 bytes I think it would be good to break out this sections into separate .test files called from different python functions - that would let them run in parallel and make it a bit easier to navigate the tests. -- To view, visit http://gerrit.cloudera.org:8080/12113 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib7e7278e902160d7060f8097290bc172d9031f94 Gerrit-Change-Number: 12113 Gerrit-PatchSet: 11 Gerrit-Owner: Janaki Lahorani <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Wed, 26 Dec 2018 16:02:29 +0000 Gerrit-HasComments: Yes
