Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10584 )
Change subject: IMPALA-7077. Add a configuration for the max number of partitions to load ...................................................................... Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/10584/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10584/1//COMMIT_MSG@19 PS1, Line 19: partitions will be created > You mean those are created on the filesystem but not in HMS or something el right, will clarify in the commit message. http://gerrit.cloudera.org:8080/#/c/10584/1//COMMIT_MSG@19 PS1, Line 19: will no longer load : following the change > will that table be purged or retained in catalogd in some sort of unusable Will clarify this http://gerrit.cloudera.org:8080/#/c/10584/1//COMMIT_MSG@21 PS1, Line 21: worth the complexity to try to : somehow limit the addition of partitions during a DML > I guess this is a simple check in CatalogOpEx#updateCatalog() unless I'm mi The problem is that the catalog update happens at the end of the DML, after the insert sinks have already created all of the directories and files on the filesystem, if I understand the flow correctly. So, unless we wanted to try to back out the changes by rm -rfing the files that were created (seems difficult and error-prone) we're kind of stuck with the following choices: 1) leave the files on the filesystem but don't create the partitions in HMS 2) create the partitions in HMS but leave the table in a state where Impala will refuse to load it (but at least Hive will load it). Neither is great but I couldn't think of a good alternative. http://gerrit.cloudera.org:8080/#/c/10584/1/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/10584/1/common/thrift/CatalogService.thrift@68 PS1, Line 68: 7: optional list<string> warning_messages > Where is this used? Don't see it's references in the patch. ah, I originally attempted to use this to send warnings back on DML, but then reverted this part of the change. forgot to remove the thrift, sorry http://gerrit.cloudera.org:8080/#/c/10584/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10584/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@287 PS1, Line 287: public static int MAX_PARTITIONS_PER_TABLE = > final? it's modified by a test so can't make it final. I can add a comment to that effect if you want. http://gerrit.cloudera.org:8080/#/c/10584/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10584/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2067 PS1, Line 2067: if (curPartitionCount + allHmsPartitionsToAdd.size() > HdfsTable.MAX_PARTITIONS_PER_TABLE) { > nit: line overflow (here and below) these are < 100 chars which is the line limit that gerrit is showing. Do you try to keep to 80 chars in Java code? -- To view, visit http://gerrit.cloudera.org:8080/10584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife99a97a891ed14675303ea472abb2932a72cb51 Gerrit-Change-Number: 10584 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Comment-Date: Tue, 05 Jun 2018 18:00:11 +0000 Gerrit-HasComments: Yes
