Alex Behm has posted comments on this change. Change subject: IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables ......................................................................
Patch Set 7: (10 comments) Responses to comments. Doing another round now. http://gerrit.cloudera.org:8080/#/c/4414/7/be/src/service/frontend.cc File be/src/service/frontend.cc: Line 62: "value should be a comma separated list of hostnames or IP addresses."); > Ports are optional. Add that detail to the description. http://gerrit.cloudera.org:8080/#/c/4414/7/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: Line 358: 1: required list<TRangeLiteral> values > Hm, TExpr? That sounds a bit too much for simple literal values unless we e Why add a new special purpose structure that is redundant with TExpr? To me a TExpr is pretty clear, but without looking at the definition, I don't really know what a TRangeLiteral is. http://gerrit.cloudera.org:8080/#/c/4414/7/fe/src/main/java/org/apache/impala/analysis/DistributeParam.java File fe/src/main/java/org/apache/impala/analysis/DistributeParam.java: Line 105: for (String colName: colNames_) { > Not sure I follow. Are we doing something to prevent this? I was wondering whether we should allow something like DISTRIBUTED BY(a, a, a, a, a) or whether we should reject that because 'a' is mentioned multiple times. We reject such cases for PKs. http://gerrit.cloudera.org:8080/#/c/4414/7/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java: Line 231: } catch (Exception e) { > I don't think we can differentiate these two given the way the Kudu client Got it, thanks! http://gerrit.cloudera.org:8080/#/c/4414/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: Line 1350: // CTAS in an external Kudu table > Hm, what does it really mean to allow CTAS on external Kudu table? We have I was thinking it would mean the same thing it does for HDFS tables. After the CTAS we have a new external table with a schema according to the query statement. The fact that it's a Kudu and not an HDFS table seems like an insignificant detail from a user's point of view. In other words, at a high-level it's not very clear to me why Kudu would behave any different than HDFS. I get that the implementation may be weird. Let's revisit in a follow-on change, if necessary. Line 1739: AnalysisError(String.format("create table tab (x int) distribute by hash (x) " + > Well these were never specified in TBLPROPERTIES so I am not sure if we sho I see. Most of the metadata is in Kudu now. Nice! http://gerrit.cloudera.org:8080/#/c/4414/6/testdata/bin/generate-schema-statements.py File testdata/bin/generate-schema-statements.py: Line 203: if row_format and file_format == 'text': > Hm, I couldn't find anything in the docs or in our tests. There are a few examples here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL No idea whether we'll actually do the right thing http://gerrit.cloudera.org:8080/#/c/4414/7/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: Line 204: CREATE TABLE {table} (c INT PRIMARY KEY) > That's interesting. Actually, currently don't work because we can't persist File JIRA and defer. Line 213: TBLPROPERTIES ('kudu.master_addresses'='{kudu_addr}')""".format( > Good question. If the user doesn't specify the kudu master_addresses we pop Thanks for the explanation, seems fine. Line 226: DISTRIBUTE BY HASH (c) INTO 3 BUCKETS, RANGE (c) SPLIT ROWS (...) > No, but Kudu's API doesn't currently return any information about range par Got it, thanks. -- To view, visit http://gerrit.cloudera.org:8080/4414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Michael Brown <[email protected]> Gerrit-HasComments: Yes
