Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-3724: Support Kudu non-covering range partitions ......................................................................
Patch Set 3: (19 comments) http://gerrit.cloudera.org:8080/#/c/4856/3/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: Line 354: 2: optional bool lower_bound_type > is_lower_bound_inclusive? Done Line 356: 4: optional bool upper_bound_type > is_upper_bound_inclusive? Done http://gerrit.cloudera.org:8080/#/c/4856/3/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: Line 1184: if (val.isEmpty()) { > I don't think this case is possible Done Line 1194: if (val.isEmpty()) { > I don't think this case is possible Done http://gerrit.cloudera.org:8080/#/c/4856/3/fe/src/main/java/org/apache/impala/analysis/DistributeParam.java File fe/src/main/java/org/apache/impala/analysis/DistributeParam.java: Line 139: // Impala does not check for overlapping range partitions; these checks are > move to function comment Done http://gerrit.cloudera.org:8080/#/c/4856/3/fe/src/main/java/org/apache/impala/analysis/RangePartition.java File fe/src/main/java/org/apache/impala/analysis/RangePartition.java: Line 53: private final boolean lowerBoundType_; > "Type" seems weird here, lowerBoundInclusive? Done Line 68: && lowerBoundValues.equals(upperBoundValues)); > lowerBoundValues == upperBoundValues Done Line 110: analyzeBoundaryValues(upperBound_, analyzer); > if (!isSingletonRange_) ? Done Line 121: "for range-partition bounds: %s", boundaryVal.toSql())); > include the cause Done http://gerrit.cloudera.org:8080/#/c/4856/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: PS3, Line 1342: // CTAS into managed Kudu tables : AnalyzesOk("create table t primary key (id) distribute by hash (id) into 3 buckets" + : " stored as kudu as select id, bool_col, tinyint_col, smallint_col, int_col, " + : "bigint_col, float_col, double_col, date_string_col, string_col " + : "from functional.alltypestiny"); : // CTAS in an external Kudu table : AnalysisError("create external table t stored as kudu " + : "tblproperties('kudu.table_name'='t') as select id, int_col from " + : "functional.alltypestiny", "CREATE TABLE AS SELECT is not supported for " + : "external Kudu tables."); : : // CTAS into Kudu tables with unsupported types : AnalysisError("create table t primary key (id) distribute by hash into 3 buckets" + : " stored as kudu as select id, timestamp_col from functional.alltypestiny", : "Cannot create table 't': Type TIMESTAMP is not supported in Kudu"); : AnalysisError("create table t primary key (cs) distribute by hash into 3 buckets" + : " stored as kudu as select cs from functional.chars_tiny", : "Cannot create table 't': Type CHAR(5) is not supported in Kudu"); : AnalysisError("create table t primary key (vc) distribute by hash into 3 buckets" + : " stored as kudu as select vc from functional.chars_tiny", : "Cannot create table 't': Type VARCHAR(32) is not supported in Kudu"); : AnalysisError("create table t primary key (id) distribute by hash into 3 buckets" + : " stored as kudu as select id, s from functional.complextypes_fileformat", : "Expr 's' in select list returns a complex type 'STRUCT<f1:STRING,f2:INT>'.\n" + : "Only scalar types are allowed in the select list."); : AnalysisError("create table t primary key (id) distribute by hash into 3 buckets" + : " stored as kudu as select id, m from functional.complextypes_fileformat", : "Expr 'm' in select list returns a complex type 'MAP<STRING,BIGINT>'.\n" + : "Only scalar types are allowed in the select list."); : AnalysisError("create table t primary key (id) distribute by hash into 3 buckets" + : " stored as kudu as select id, a from functional.complextypes_fileformat", : "Expr 'a' in select list returns a complex type 'ARRAY<INT>'.\n" + : "Only scalar types are allowed in the select list."); > We should have some CTAS coverage as well. Added a few more tests with range + hash for completion. Parameterizing these test cases may be an overkill as it doesn't really improve the code coverage. CTAS and regular CT use the same analysis path wrt distribution schemes. Line 1753: AnalysisError("create table tab (x int primary key) distribute by range (x) " + > also try a constexpr that evaluates to NULL Done Line 1755: "for range-partition bounds: x + 1"); > also add a negative test with some crazy expr like a subquery or analytic f Crazy cases give crazy errors :) Good one. Done Line 1819: // Key ranges must match the column types. > can you add neg cases for tinyint,smallint,int,bigint where the value is bi Done http://gerrit.cloudera.org:8080/#/c/4856/3/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: Line 1632: //TestUtils.assumeKuduIsSupported(); > ? hahah, ask Alex about it. Removed the comment. Sorry. Done http://gerrit.cloudera.org:8080/#/c/4856/3/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test: PS3, Line 82: create table tab (a int primary key) distributed by range (a) (partition value = false) : stored as kudu : ---- CATCH : ImpalaRuntimeException: Expected 'int32' literal for column 'a' got 'BOOLEAN' > can this be an analysis FE test? Well the checks in the analysis allow this case because boolean is implicitly castable to int, but it hits an error when we try to create the kudu range partition in the catalog. Kind of a weird edge case that someone could argue should be allowed. Thoughts? http://gerrit.cloudera.org:8080/#/c/4856/3/testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test File testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test: > 1. Can you add a case to exercise all supported key types (I believe that's Matt this is a not a valid partition syntax. You can't have multiple-column partition ranges only single value partitions (.e.g value = (1,1,1,1,"1"), value = (2,2,2,2,"2"). I added one case based on your example. Let me know if that covers the cases you have in mind. PS3, Line 21: varchar(20) > string Done Line 338: # Insert row that are not covered by any of the existing range partitions > are -> is Done Line 347: # Insert row that are not covered by any of the existing range partitions > are -> is Done -- To view, visit http://gerrit.cloudera.org:8080/4856 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6799c01a37003f0f4c068d911a13e3f060110a06 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-HasComments: Yes
