Alex Behm has posted comments on this change.

Change subject: IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables
......................................................................


Patch Set 7:

(10 comments)

Responses to comments. Doing another round now.

http://gerrit.cloudera.org:8080/#/c/4414/7/be/src/service/frontend.cc
File be/src/service/frontend.cc:

Line 62:     "value should be a comma separated list of hostnames or IP 
addresses.");
> Ports are optional.
Add that detail to the description.


http://gerrit.cloudera.org:8080/#/c/4414/7/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 358:   1: required list<TRangeLiteral> values
> Hm, TExpr? That sounds a bit too much for simple literal values unless we e
Why add a new special purpose structure that is redundant with TExpr? To me a 
TExpr is pretty clear, but without looking at the definition, I don't really 
know what a TRangeLiteral is.


http://gerrit.cloudera.org:8080/#/c/4414/7/fe/src/main/java/org/apache/impala/analysis/DistributeParam.java
File fe/src/main/java/org/apache/impala/analysis/DistributeParam.java:

Line 105:     for (String colName: colNames_) {
> Not sure I follow. Are we doing something to prevent this?
I was wondering whether we should allow something like DISTRIBUTED BY(a, a, a, 
a, a) or whether we should reject that because 'a' is mentioned multiple times. 
We reject such cases for PKs.


http://gerrit.cloudera.org:8080/#/c/4414/7/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java:

Line 231:     } catch (Exception e) {
> I don't think we can differentiate these two given the way the Kudu client 
Got it, thanks!


http://gerrit.cloudera.org:8080/#/c/4414/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

Line 1350:     // CTAS in an external Kudu table
> Hm, what does it really mean to allow CTAS on external Kudu table? We have 
I was thinking it would mean the same thing it does for HDFS tables. After the 
CTAS we have a new external table with a schema according to the query 
statement. The fact that it's a Kudu and not an HDFS table seems like an 
insignificant detail from a user's point of view. In other words, at a 
high-level it's not very clear to me why Kudu would behave any different than 
HDFS.

I get that the implementation may be weird. Let's revisit in a follow-on 
change, if necessary.


Line 1739:     AnalysisError(String.format("create table tab (x int) distribute 
by hash (x) " +
> Well these were never specified in TBLPROPERTIES so I am not sure if we sho
I see. Most of the metadata is in Kudu now. Nice!


http://gerrit.cloudera.org:8080/#/c/4414/6/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

Line 203:   if row_format and file_format == 'text':
> Hm, I couldn't find anything in the docs or in our tests.
There are a few examples here:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL


No idea whether we'll actually do the right thing


http://gerrit.cloudera.org:8080/#/c/4414/7/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 204:         CREATE TABLE {table} (c INT PRIMARY KEY)
> That's interesting. Actually, currently don't work because we can't persist
File JIRA and defer.


Line 213:         TBLPROPERTIES 
('kudu.master_addresses'='{kudu_addr}')""".format(
> Good question. If the user doesn't specify the kudu master_addresses we pop
Thanks for the explanation, seems fine.


Line 226:         DISTRIBUTE BY HASH (c) INTO 3 BUCKETS, RANGE (c) SPLIT ROWS 
(...)
> No, but Kudu's API doesn't currently return any information about range par
Got it, thanks.


-- 
To view, visit http://gerrit.cloudera.org:8080/4414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to