Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19055 )
Change subject: IMPALA-3119: DDL support for bucketed tables ...................................................................... Patch Set 10: (10 comments) Thanks for contributing this! I left some comments. http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG@18 PS10, Line 18: [BUCKETED BY HASH([column [, column ...]])|RANDOM INTO 24 BUCKETS It'd be better if we can highlight this line since it's the only new part. We should also write this based on the cup file. http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG@29 PS10, Line 29: 1. CLUSTERED BY of Hive is not supported, because HINT has the keyword; Do you mean "CLUSTERED" has been used as a hint so we can't use it as a keyword? I'm not sure what blocks this. Could you share the error you saw? It'd be nice to have the consistent syntax as HQL. http://gerrit.cloudera.org:8080/#/c/19055/10//COMMIT_MSG@30 PS10, Line 30: 2. The bucket partitioning algorithm contains HASH, RANDOM, KUDU_HASH. Are these recognized by Hive? i.e. if Hive inserts data into the table, is it using the hash algorithm we expected? http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/cup/sql-parser.cup@1636 PS10, Line 1636: | opt_bucket_desc:bucket I think we don't need two switches here. Like other optional fields, we can add an empty switch in opt_bucket_desc, e.g. https://github.com/apache/impala/blob/6a1a871fb7f014be0ab9dbc0ac450416b897a263/fe/src/main/cup/sql-parser.cup#L1650-L1653 http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/cup/sql-parser.cup@1705 PS10, Line 1705: {: RESULT = new Pair<List<String>, TSortingOrder>(null, TSortingOrder.LEXICAL); :} nit: Let's skip reformatting unrelated codes. http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@329 PS10, Line 329: ensureTableNotBucketed(table); This blocks us from dropping a bucketed table. But it's ok to support dropping bucketed tables in another JIRA. http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/jflex/sql-scanner.flex File fe/src/main/jflex/sql-scanner.flex: http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/main/jflex/sql-scanner.flex@178 PS10, Line 178: kuduhash The commit message mentions "kudu_hash". http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2844 PS10, Line 2844: HASH() Can we add tests for "HASH" and "KUDUHASH" without the parentheses? BUCKETED BY HASH INTO 12 BUCKETS BUCKETED BY KUDUHASH INTO 12 BUCKETS http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@a4347 PS10, Line 4347: Can we keep this since this still doesn't work? http://gerrit.cloudera.org:8080/#/c/19055/10/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3074 PS10, Line 3074: ParsesOk("CREATE TABLE bucketed_test (i int, s string) BUCKETED BY RANDOM"); Could you add some tests for KUDU_HASH? -- To view, visit http://gerrit.cloudera.org:8080/19055 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I919b4d4139bc3a7784fa6fdb6f064e25666d548e Gerrit-Change-Number: 19055 Gerrit-PatchSet: 10 Gerrit-Owner: Baike Xia <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Tue, 18 Oct 2022 12:04:53 +0000 Gerrit-HasComments: Yes
