[Impala-ASF-CR] IMPALA-3724: Support Kudu non-covering range partitions

Dimitris Tsirogiannis (Code Review) Mon, 31 Oct 2016 11:13:22 -0700

Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-3724: Support Kudu non-covering range partitions
......................................................................



Patch Set 2:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/4856/2/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 355:   2: optional ExternalDataSource.TComparisonOp lower_bound_op
> Let's do whatever feels cleanest. Lmk if you disagree with bool and we can 
Done


http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

Line 1192:   literal:l LESSTHAN
> We need an Expr and constant folding here to support negative ranges like "
Done


http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/main/java/org/apache/impala/analysis/DistributeParam.java
File fe/src/main/java/org/apache/impala/analysis/DistributeParam.java:

Line 35:  * clause of a CREATE TABLE statement. The distribution can be 
hash-based or
> add something like "See RangePartition for details on the supported range p
Done


Line 36:  * range-based.
> or both.
Done


Line 91:   // Primary key columns of this distribution. If no columns are 
specified, all
> Columns of this distribution? Seem better to not overload 'primary key' wit
Done


Line 92:   // the primary key columns of the associated table as used.
> as -> are
Done


Line 139:       if (!lowerBound.isEmpty() && lowerBound.size() != 
colNames_.size()) {
> Are we deferring checks like conflicting unbounded ranges to Kudu? Either a
Yes, all these checks are performed by Kudu. We also have tests for this. Added 
a comment. Done


Line 141:             "partition values is different than the number of 
projected key " +
> projected key columns -> distribution columns?
Done


Line 174:           "(type: %s) is not type compatible with column '%s' (type: 
%s).",
> with distribution column
Done


Line 191:       builder.append(" (");
> missing "RANGE(col)"?
L181


http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/main/java/org/apache/impala/analysis/RangePartition.java
File fe/src/main/java/org/apache/impala/analysis/RangePartition.java:

Line 56:   private RangePartition(List<LiteralExpr> lowerBoundValues, 
BinaryPredicate.Operator lOp,
> Why not represent the OP as a bool inclusive/exclusive? Seems simpler with 
Done


http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java
File fe/src/main/java/org/apache/impala/util/KuduUtil.java:

Line 117:    * Builds and returns a range partition bound to be used in the 
creation of a Kudu
> remove "to be" to disambiguate ("bound" could be a interpreted as noun or "
Done


Line 167:     TExprNode literal = boundaryVal.getNodes().get(0);
> check state that boundaryVal is a literal expr
Aren't the checks below sufficient?


http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

Line 1632:     //TestUtils.assumeKuduIsSupported();
> I know why this is here :)
At least I didn't remove it completely as I did the last time :)


Line 2386:     ParserError("CREATE TABLE Foo (i int) DISTRIBUTE BY RANGE(i) 
SPLIT ROWS ((1),(2))");
> remove (there many things we don't support)
Done


Line 2408:     ParserError("CREATE TABLE Foo (a int) DISTRIBUTE BY RANGE (a) 
()");
> Test that something like >= does not parse.
I tend to think that anything that can be enforced in the grammar (as long as 
it doesn't complicate the parser) it should be enforced there. If you feel 
strongly about moving these checks in the analysis I don't mind changing it.


Line 2419:     ParserError("CREATE TABLE Foo (a int) DISTRIBUTE BY RANGE (a) " +
> add tests with PARTITION 1 < VALUE and PARTITION VALUES = 10 (mismatched co
Good point. Done


Line 2561:     ParsesOk("CREATE TABLE Foo PRIMARY KEY (a, b) DISTRIBUTE BY HASH 
(b) INTO 2 " +
> add one CTAS test with RANGE and some PARTITIONs
Done


http://gerrit.cloudera.org:8080/#/c/4856/2/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test:

Line 91: (partitiion values <= 1, partition 1 < value <= 10, partition 10 < 
values) stored as kudu;
> typo: partitiion
Done


http://gerrit.cloudera.org:8080/#/c/4856/2/testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test:

> can you add test cases for
Done


http://gerrit.cloudera.org:8080/#/c/4856/2/testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test:

Line 68:   (partition values <= 10, partition 10 < values <=20, partition 20 < 
values <= 30,
> nit: space after second <=
Done


Line 127:   primary key (id, name)) distribute by hash(id) into 4 buckets, 
range(id, name)
> is there a limit on the number of range partitions?
The kudu guys say there isn't. Do you want me to add a test with a large number 
of partitions?


-- 
To view, visit http://gerrit.cloudera.org:8080/4856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6799c01a37003f0f4c068d911a13e3f060110a06
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-3724: Support Kudu non-covering range partitions

Reply via email to