Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-3724: Support Kudu non-covering range partitions ......................................................................
Patch Set 2: (22 comments) http://gerrit.cloudera.org:8080/#/c/4856/2/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: Line 355: 2: optional ExternalDataSource.TComparisonOp lower_bound_op > Let's do whatever feels cleanest. Lmk if you disagree with bool and we can Done http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: Line 1192: literal:l LESSTHAN > We need an Expr and constant folding here to support negative ranges like " Done http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/main/java/org/apache/impala/analysis/DistributeParam.java File fe/src/main/java/org/apache/impala/analysis/DistributeParam.java: Line 35: * clause of a CREATE TABLE statement. The distribution can be hash-based or > add something like "See RangePartition for details on the supported range p Done Line 36: * range-based. > or both. Done Line 91: // Primary key columns of this distribution. If no columns are specified, all > Columns of this distribution? Seem better to not overload 'primary key' wit Done Line 92: // the primary key columns of the associated table as used. > as -> are Done Line 139: if (!lowerBound.isEmpty() && lowerBound.size() != colNames_.size()) { > Are we deferring checks like conflicting unbounded ranges to Kudu? Either a Yes, all these checks are performed by Kudu. We also have tests for this. Added a comment. Done Line 141: "partition values is different than the number of projected key " + > projected key columns -> distribution columns? Done Line 174: "(type: %s) is not type compatible with column '%s' (type: %s).", > with distribution column Done Line 191: builder.append(" ("); > missing "RANGE(col)"? L181 http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/main/java/org/apache/impala/analysis/RangePartition.java File fe/src/main/java/org/apache/impala/analysis/RangePartition.java: Line 56: private RangePartition(List<LiteralExpr> lowerBoundValues, BinaryPredicate.Operator lOp, > Why not represent the OP as a bool inclusive/exclusive? Seems simpler with Done http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java File fe/src/main/java/org/apache/impala/util/KuduUtil.java: Line 117: * Builds and returns a range partition bound to be used in the creation of a Kudu > remove "to be" to disambiguate ("bound" could be a interpreted as noun or " Done Line 167: TExprNode literal = boundaryVal.getNodes().get(0); > check state that boundaryVal is a literal expr Aren't the checks below sufficient? http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: Line 1632: //TestUtils.assumeKuduIsSupported(); > I know why this is here :) At least I didn't remove it completely as I did the last time :) Line 2386: ParserError("CREATE TABLE Foo (i int) DISTRIBUTE BY RANGE(i) SPLIT ROWS ((1),(2))"); > remove (there many things we don't support) Done Line 2408: ParserError("CREATE TABLE Foo (a int) DISTRIBUTE BY RANGE (a) ()"); > Test that something like >= does not parse. I tend to think that anything that can be enforced in the grammar (as long as it doesn't complicate the parser) it should be enforced there. If you feel strongly about moving these checks in the analysis I don't mind changing it. Line 2419: ParserError("CREATE TABLE Foo (a int) DISTRIBUTE BY RANGE (a) " + > add tests with PARTITION 1 < VALUE and PARTITION VALUES = 10 (mismatched co Good point. Done Line 2561: ParsesOk("CREATE TABLE Foo PRIMARY KEY (a, b) DISTRIBUTE BY HASH (b) INTO 2 " + > add one CTAS test with RANGE and some PARTITIONs Done http://gerrit.cloudera.org:8080/#/c/4856/2/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test: Line 91: (partitiion values <= 1, partition 1 < value <= 10, partition 10 < values) stored as kudu; > typo: partitiion Done http://gerrit.cloudera.org:8080/#/c/4856/2/testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test File testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test: > can you add test cases for Done http://gerrit.cloudera.org:8080/#/c/4856/2/testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test File testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test: Line 68: (partition values <= 10, partition 10 < values <=20, partition 20 < values <= 30, > nit: space after second <= Done Line 127: primary key (id, name)) distribute by hash(id) into 4 buckets, range(id, name) > is there a limit on the number of range partitions? The kudu guys say there isn't. Do you want me to add a test with a large number of partitions? -- To view, visit http://gerrit.cloudera.org:8080/4856 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6799c01a37003f0f4c068d911a13e3f060110a06 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-HasComments: Yes
