Baike Xia has posted comments on this change. ( http://gerrit.cloudera.org:8080/19055 )
Change subject: IMPALA-3119: DDL support for bucketed tables ...................................................................... Patch Set 11: (9 comments) http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG@18 PS10, Line 18: CREATE TABLE tbl (i int COMMENT 'hello', s string) > It'd be better if we can highlight this line since it's the only new part. Got. http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG@29 PS10, Line 29: the hash partition is equivalent to a bucket, > Do you mean "CLUSTERED" has been used as a hint so we can't use it as a key If add supported "CLUSTERED" in cup file, execute SQL with CLUTERED in hint and an error will be reported.I.E. execute sql - "create /* +CLUSTERED */ test as select * from tpcds.item;", error messge: ` Query: create /* +CLUSTERED */ test as select * from tpcds.item Query submitted at: 2022-10-24 09:09:52 (Coordinator: http://d403ca04eda0:25000) ERROR: ParseException: Syntax error in line 1: create /* +CLUSTERED */ test ^ Encountered: CLUSTERED Expected: STRAIGHT_JOIN, COMMA, IDENTIFIER CAUSED BY: Exception: Syntax error ` http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG@30 PS10, Line 30: and the optimization rule applies to join query; > Are these recognized by Hive? i.e. if Hive inserts data into the table, is If HASH is used, the behavior is the same as hive. If not, the hive behavior is incompatible with the Hive behavior. If Hive inserts data into the table, it's considered a HASH, which is what we expect. Multiple bucket hash functions are used because hive's bucket hash algorithm is different from kudu's bucket hash algorithm. To be compatible with bucket join optimization in kudu table, multiple bucket hash functions are used. In other words, the kudu table is not supported in HASH mode. Using KUDU_HASH, however, results in tabular forms not being recognized by computing engines other than impala. http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/cup/sql-parser.cup@1636 PS10, Line 1636: | opt_bucket_desc:bucket > I think we don't need two switches here. Like other optional fields, we can Yes, but adding empty to opt_bucket_desc causes a compilation error. So I took this approach. Or, can you give me some advice? ` Warning : *** Reduce/Reduce conflict found in state #1587 between opt_bucket_desc ::= (*) and opt_sort_cols ::= (*) under symbols: {} Resolved in favor of the first production. Warning : *** Shift/Reduce conflict found in state #1587 between opt_bucket_desc ::= (*) and opt_sort_cols ::= (*) KW_SORT KW_BY KW_ZORDER LPAREN opt_ident_list RPAREN and opt_sort_cols ::= (*) KW_SORT KW_BY LPAREN opt_ident_list RPAREN and opt_sort_cols ::= (*) KW_SORT KW_BY KW_LEXICAL LPAREN opt_ident_list RPAREN under symbol KW_SORT Resolved in favor of shifting. Warning : *** Shift/Reduce conflict found in state #1587 between opt_sort_cols ::= (*) and opt_sort_cols ::= (*) KW_SORT KW_BY KW_ZORDER LPAREN opt_ident_list RPAREN and opt_sort_cols ::= (*) KW_SORT KW_BY LPAREN opt_ident_list RPAREN and opt_sort_cols ::= (*) KW_SORT KW_BY KW_LEXICAL LPAREN opt_ident_list RPAREN under symbol KW_SORT Resolved in favor of shifting. ` http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/cup/sql-parser.cup@1705 PS10, Line 1705: {: RESULT = new Pair<List<String>, TSortingOrder>(null, TSortingOrder.LEXICAL); :} > nit: Let's skip reformatting unrelated codes. I Got. http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/jflex/sql-scanner.flex File fe/src/main/jflex/sql-scanner.flex: http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/jflex/sql-scanner.flex@178 PS10, Line 178: kudu_has > The commit message mentions "kudu_hash". Done http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2844 PS10, Line 2844: ber mu > Can we add tests for "HASH" and "KUDUHASH" without the parentheses? Yes, I can. http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@a4347 PS10, Line 4347: > Can we keep this since this still doesn't work? Yes, we can keep this since. This was taken off when I tried clustered. http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3074 PS10, Line 3074: ParsesOk("CREATE TABLE bucketed_test (i int COMMENT 'hello', s string) " + > Could you add some tests for KUDU_HASH? Done -- To view, visit http://gerrit.cloudera.org:8080/19055 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I919b4d4139bc3a7784fa6fdb6f064e25666d548e Gerrit-Change-Number: 19055 Gerrit-PatchSet: 11 Gerrit-Owner: Baike Xia <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Baike Xia <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Mon, 24 Oct 2022 12:16:54 +0000 Gerrit-HasComments: Yes
