Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10584 )

Change subject: IMPALA-7077. Add a configuration for the max number of 
partitions to load
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/10584/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10584/1//COMMIT_MSG@19
PS1, Line 19: partitions will be created
> You mean those are created on the filesystem but not in HMS or something el
right, will clarify in the commit message.


http://gerrit.cloudera.org:8080/#/c/10584/1//COMMIT_MSG@19
PS1, Line 19: will no longer load
            : following the change
> will that table be purged or retained in catalogd in some sort of unusable
Will clarify this


http://gerrit.cloudera.org:8080/#/c/10584/1//COMMIT_MSG@21
PS1, Line 21: worth the complexity to try to
            : somehow limit the addition of partitions during a DML
> I guess this is a simple check in CatalogOpEx#updateCatalog() unless I'm mi
The problem is that the catalog update happens at the end of the DML, after the 
insert sinks have already created all of the directories and files on the 
filesystem, if I understand the flow correctly. So, unless we wanted to try to 
back out the changes by rm -rfing the files that were created (seems difficult 
and error-prone) we're kind of stuck with the following choices:

1) leave the files on the filesystem but don't create the partitions in HMS
2) create the partitions in HMS but leave the table in a state where Impala 
will refuse to load it (but at least Hive will load it).

Neither is great but I couldn't think of a good alternative.


http://gerrit.cloudera.org:8080/#/c/10584/1/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/10584/1/common/thrift/CatalogService.thrift@68
PS1, Line 68:   7: optional list<string> warning_messages
> Where is this used? Don't see it's references in the patch.
ah, I originally attempted to use this to send warnings back on DML, but then 
reverted this part of the change. forgot to remove the thrift, sorry


http://gerrit.cloudera.org:8080/#/c/10584/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10584/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@287
PS1, Line 287:   public static int MAX_PARTITIONS_PER_TABLE =
> final?
it's modified by a test so can't make it final. I can add a comment to that 
effect if you want.


http://gerrit.cloudera.org:8080/#/c/10584/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10584/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2067
PS1, Line 2067:     if (curPartitionCount + allHmsPartitionsToAdd.size() > 
HdfsTable.MAX_PARTITIONS_PER_TABLE) {
> nit: line overflow (here and below)
these are < 100 chars which is the line limit that gerrit is showing. Do you 
try to keep to 80 chars in Java code?



--
To view, visit http://gerrit.cloudera.org:8080/10584
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife99a97a891ed14675303ea472abb2932a72cb51
Gerrit-Change-Number: 10584
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Tue, 05 Jun 2018 18:00:11 +0000
Gerrit-HasComments: Yes

Reply via email to