Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/10543 )
Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location ...................................................................... Patch Set 16: (11 comments) http://gerrit.cloudera.org:8080/#/c/10543/15//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10543/15//COMMIT_MSG@21 PS15, Line 21: For managed tables it's no longer allowed to drop a partition that > For managed tables,... Done http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1227 PS12, Line 1227: partitionSet == nu > Isn't this more like a precondition? Also you could just say Done http://gerrit.cloudera.org:8080/#/c/10543/15/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10543/15/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1202 PS15, Line 1202: HashSet<HdfsPartition> partitions = locationToPartMap_.get(location); > Preconditions.checkNotNull(partitions); Done http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2245 PS12, Line 2245: ((HdfsTable)tbl).isLocationSharedWithOtherPartitions(part)) { > Should we check if the table is "managed" / "external" in Hive? I'm guessin Since dropping a partition from an external table doesn't actually drop the physical directory for the partition, I think it's a good idea not to indicate error here for external tables. Thx! There was an weird behaviour I found here: I created a managed table, added some partitions, then created an external table pointing to the managed table, added few more partitions and checked them in the managed table. Apparently, the partitions are added to the table's directory in HDFS but the ones created for the external table are not visible to the managed table even after an invalidate metadata. This is unrelated to this patch but still is weird. http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2247 PS12, Line 2247: t > Could you format the string to add the partition data here. Good point! Do you have a preference how many partition names should be displayed? I picked 5, do you think it's sufficient? http://gerrit.cloudera.org:8080/#/c/10543/15/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10543/15/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2244 PS15, Line 2244: !Table.isExternalTable(tb > Remove. This already holds via L2229. Done http://gerrit.cloudera.org:8080/#/c/10543/12/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: http://gerrit.cloudera.org:8080/#/c/10543/12/tests/metadata/test_partition_metadata.py@206 PS12, Line 206: # notice that some partitions have been dropped by Hive. > May be just call refresh instead? Done http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py@173 PS15, Line 173: This : test checks > Typo Done http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py@174 PS15, Line 174: partition > Remove Done http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py@177 PS15, Line 177: pytest.skip() > Reading through this again, I feel that this can be enabled in the current I preferred to disable this test for two reasons: - I didn't want to assert on some condition that actually not the desired outcome of this scenario. - I think if we assert on j=2 still being a valid partition then once HIVE-19830 makes it to Impala it would break the builds and someone probably not familiar to this change has to figure out why it went wrong. I found it a bit safer to skip the test and enable it once the related Hive change is done. What do you think? http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py@209 PS15, Line 209: As Hive dropped j=1 partition it is e > Update Done -- To view, visit http://gerrit.cloudera.org:8080/10543 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a Gerrit-Change-Number: 10543 Gerrit-PatchSet: 16 Gerrit-Owner: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Michael Brown <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Wed, 20 Jun 2018 06:47:24 +0000 Gerrit-HasComments: Yes
