Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10543 )
Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location ...................................................................... Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/10543/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10543/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1485 PS2, Line 1485: if (partitions == null) { > I'm not an expert on the Catalog code, so feel free to push back. - I have seen tables with > 100k partitions. If we take this route, we'd need to loop through them in every "alter" request. I think that'd be expensive, but I could be wrong (add partition is perf critical operation AFAICT) - Also, like you mentioned, this adds some extra memory, especially the String keys (the partition objects are just references) so we need to probably intern them. - Either way, we can probably benchmark both the approaches on large partitioned tables and see which one is better. Gabor, any thoughts on this? http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py@159 PS1, Line 159: assert data.split('\t') == ['21', '6'] > Nice catch as dropping a partition in this case would cause some issue. Not I see. Did you get a chance to see how Apache Hive behaves in this case? -- To view, visit http://gerrit.cloudera.org:8080/10543 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a Gerrit-Change-Number: 10543 Gerrit-PatchSet: 2 Gerrit-Owner: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Thu, 31 May 2018 20:32:08 +0000 Gerrit-HasComments: Yes
