Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10543 )
Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location ...................................................................... Patch Set 1: (13 comments) http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@189 PS1, Line 189: in String as nit: remove, kinda obvious http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@189 PS1, Line 189: :: nit:Java style http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@191 PS1, Line 191: private final HashMap<String, HashSet<HdfsPartition>> locationToPartMap_ = Unfortunate that a lot of state maintained in this class and is pretty confusing and can lead to subtle bugs if we forget to update/remove from one of these data structures. We need to clean these up eventually. http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1105 PS1, Line 1105: partition.getLocation() can be inferred from the partition object? http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1204 PS1, Line 1204: :: nit:Java http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1205 PS1, Line 1205: maintains updates http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1207 PS1, Line 1207: chang nit: update? http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1214 PS1, Line 1214: Adds 'location' as key and 'partition' as value to locationToPartMap_. May be just say "Adds 'partition' to locationTo..." Rest is implicit. http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1216 PS1, Line 1216: insertToLocationPartMap call it addToLocationPartMap may be? http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1468 PS1, Line 1468: a Map that contains all the HdfsPartitions : * that point to the same locations as the ones received I think the earlier description is more clear. http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2434 PS1, Line 2434: change nit: call it updatePartitionLocation()? http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2435 PS1, Line 2435: } else { : partition.setLocation(location); : } I don't think non HdfsTable is possible here. Look at catalog_.getHdfsPartition() http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py@159 PS1, Line 159: assert data.split('\t') == ['21', '6'] Could you drop one of the partitions that point to the same partition dir and make sure the other partition still counts it? -- To view, visit http://gerrit.cloudera.org:8080/10543 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a Gerrit-Change-Number: 10543 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Wed, 30 May 2018 22:10:58 +0000 Gerrit-HasComments: Yes
