Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/10543 )
Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location ...................................................................... Patch Set 20: (11 comments) http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1435 PS18, Line 1435: / Only reload file metadata of partitions speci > This loop constructs partitionsToUpdateFileMdByPath for the case where part In my opinion the insert path is the performance critical rather than the delete partition path. So, it might make sense to get rid of locationToPartMap_ then and use this loop to get all the partitions that share its location with the input partitions. In turn the drop partition path would have some increased time complexity: <num of partitions to delete> * <num of total partitions>. Or I could so something like here with my latest change: have a set of locations related to the input partitions and then the complexity reduces to <num of total partitions>. http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2252 PS18, Line 2252: opOptions.purgeData(purge); > if the path -> partition map is dropped, an additional loop would need to c see my comments on your other question. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@124 PS18, Line 124: # Create the table : self.execute_query_expect_success( : self.client, > This looks wrong, because the directories made here will only be "same_loc_ Good point. No need to create these directories. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@127 PS18, Line 127: table % > Here and elsewhere, there's a execute_query_expect_success() method you sho Done http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@163 PS18, Line 163: chec > Are we (Impala) in control of this message? It should be "its". Done http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@168 PS18, Line 168: # Drop partition j=2 and expect this operation to fail as the partition's location is > If we are here, then we never raised an exception. It would be polite to sh Done http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@177 PS18, Line 177: assert expe > You can insert a reason here. It's mildly helpful to someone that happens t Done http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@184 PS18, Line 184: drops one of these partitions and as a result both are expected to be dropped. This : test checks that both of the mentioned partitions are no longer > As before, I can't tell that this is needed. Apparently, this is not needed either. http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@187 PS18, Line 187: A-7152 > As before, execute_query_expect_success() in places where you expect succes Done http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@191 PS18, Line 191: TBL_NAME = "same_loc_test" : FQ_TBL_NAME = unique_database + "." + TBL_NAME > Same comment for L185 applies here. Done http://gerrit.cloudera.org:8080/#/c/10543/18/tests/metadata/test_partition_metadata.py@202 PS18, Line 202: self.execute_query_expect_success( : self.client, "alter table > You can use self.hive_client() instead. Done -- To view, visit http://gerrit.cloudera.org:8080/10543 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a Gerrit-Change-Number: 10543 Gerrit-PatchSet: 20 Gerrit-Owner: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Michael Brown <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 26 Jun 2018 18:02:46 +0000 Gerrit-HasComments: Yes
