Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19055 )

Change subject: IMPALA-3119: DDL support for bucketed tables
......................................................................


Patch Set 10:

(10 comments)

Thanks for contributing this! I left some comments.

http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG@18
PS10, Line 18:  [BUCKETED BY HASH([column [, column ...]])|RANDOM INTO 24 
BUCKETS
It'd be better if we can highlight this line since it's the only new part. We 
should also write this based on the cup file.


http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG@29
PS10, Line 29: 1. CLUSTERED BY of Hive is not supported, because HINT has the 
keyword;
Do you mean "CLUSTERED" has been used as a hint so we can't use it as a 
keyword? I'm not sure what blocks this. Could you share the error you saw? It'd 
be nice to have the consistent syntax as HQL.


http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG@30
PS10, Line 30: 2. The bucket partitioning algorithm contains HASH, RANDOM, 
KUDU_HASH.
Are these recognized by Hive? i.e. if Hive inserts data into the table, is it 
using the hash algorithm we expected?


http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/cup/sql-parser.cup@1636
PS10, Line 1636:   | opt_bucket_desc:bucket
I think we don't need two switches here. Like other optional fields, we can add 
an empty switch in opt_bucket_desc, e.g.
https://github.com/apache/impala/blob/6a1a871fb7f014be0ab9dbc0ac450416b897a263/fe/src/main/cup/sql-parser.cup#L1650-L1653


http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/cup/sql-parser.cup@1705
PS10, Line 1705:   {: RESULT = new Pair<List<String>, TSortingOrder>(null, 
TSortingOrder.LEXICAL); :}
nit: Let's skip reformatting unrelated codes.


http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@329
PS10, Line 329:     ensureTableNotBucketed(table);
This blocks us from dropping a bucketed table. But it's ok to support dropping 
bucketed tables in another JIRA.


http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/jflex/sql-scanner.flex
File fe/src/main/jflex/sql-scanner.flex:

http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/jflex/sql-scanner.flex@178
PS10, Line 178: kuduhash
The commit message mentions "kudu_hash".


http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2844
PS10, Line 2844: HASH()
Can we add tests for "HASH" and "KUDUHASH" without the parentheses?

  BUCKETED BY HASH INTO 12 BUCKETS
  BUCKETED BY KUDUHASH INTO 12 BUCKETS


http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@a4347
PS10, Line 4347:
Can we keep this since this still doesn't work?


http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3074
PS10, Line 3074:     ParsesOk("CREATE TABLE bucketed_test (i int, s string) 
BUCKETED BY RANDOM");
Could you add some tests for KUDU_HASH?



--
To view, visit http://gerrit.cloudera.org:8080/19055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I919b4d4139bc3a7784fa6fdb6f064e25666d548e
Gerrit-Change-Number: 19055
Gerrit-PatchSet: 10
Gerrit-Owner: Baike Xia <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Tue, 18 Oct 2022 12:04:53 +0000
Gerrit-HasComments: Yes

Reply via email to