Alex Behm has posted comments on this change.

Change subject: IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables

Patch Set 7:


Responses to comments. Doing another round now.
File be/src/service/

Line 62:     "value should be a comma separated list of hostnames or IP 
> Ports are optional.
Add that detail to the description.
File common/thrift/CatalogObjects.thrift:

Line 358:   1: required list<TRangeLiteral> values
> Hm, TExpr? That sounds a bit too much for simple literal values unless we e
Why add a new special purpose structure that is redundant with TExpr? To me a 
TExpr is pretty clear, but without looking at the definition, I don't really 
know what a TRangeLiteral is.
File fe/src/main/java/org/apache/impala/analysis/

Line 105:     for (String colName: colNames_) {
> Not sure I follow. Are we doing something to prevent this?
I was wondering whether we should allow something like DISTRIBUTED BY(a, a, a, 
a, a) or whether we should reject that because 'a' is mentioned multiple times. 
We reject such cases for PKs.
File fe/src/main/java/org/apache/impala/service/

Line 231:     } catch (Exception e) {
> I don't think we can differentiate these two given the way the Kudu client 
Got it, thanks!
File fe/src/test/java/org/apache/impala/analysis/

Line 1350:     // CTAS in an external Kudu table
> Hm, what does it really mean to allow CTAS on external Kudu table? We have 
I was thinking it would mean the same thing it does for HDFS tables. After the 
CTAS we have a new external table with a schema according to the query 
statement. The fact that it's a Kudu and not an HDFS table seems like an 
insignificant detail from a user's point of view. In other words, at a 
high-level it's not very clear to me why Kudu would behave any different than 

I get that the implementation may be weird. Let's revisit in a follow-on 
change, if necessary.

Line 1739:     AnalysisError(String.format("create table tab (x int) distribute 
by hash (x) " +
> Well these were never specified in TBLPROPERTIES so I am not sure if we sho
I see. Most of the metadata is in Kudu now. Nice!
File testdata/bin/

Line 203:   if row_format and file_format == 'text':
> Hm, I couldn't find anything in the docs or in our tests.
There are a few examples here:

No idea whether we'll actually do the right thing
File tests/query_test/

Line 204:         CREATE TABLE {table} (c INT PRIMARY KEY)
> That's interesting. Actually, currently don't work because we can't persist
File JIRA and defer.

Line 213:         TBLPROPERTIES 
> Good question. If the user doesn't specify the kudu master_addresses we pop
Thanks for the explanation, seems fine.

> No, but Kudu's API doesn't currently return any information about range par
Got it, thanks.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Michael Brown <>
Gerrit-HasComments: Yes

Reply via email to