Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10543 )

Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same 
location
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10543/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10543/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1485
PS2, Line 1485:       if (partitions == null) {
> I'm not an expert on the Catalog code, so feel free to push back.
- I have seen tables with more than 100k partitions. If we take this route, we'd 
need to loop through all of them on every "alter" request. I think that'd be 
expensive, but I could be wrong (adding a partition is a performance-critical 
operation, AFAICT).

- Also, as you mentioned, this adds some extra memory, especially for the String 
keys (the partition objects are just references), so we probably need to intern 
them.

- Either way, we can probably benchmark both approaches on large partitioned 
tables and see which one is better.
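
To make the trade-off in the points above concrete, here is a rough sketch of the two options as I understand them: scanning all partitions on each request versus keeping a location-keyed map with interned String keys. The class LocationIndexSketch and all of its method names are made up for illustration and are not taken from HdfsTable or from the patch. String.intern() is the plain JDK call, used here only as a stand-in for whatever interning the catalog would actually use.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for illustration only; this is not Impala's HdfsTable. */
public class LocationIndexSketch {
  // Partition name -> location. Stands in for the table's partition list.
  private final Map<String, String> partitions = new HashMap<>();
  // Location -> names of the partitions stored there (the extra in-memory index).
  private final Map<String, Set<String>> byLocation = new HashMap<>();

  public void addPartition(String name, String location) {
    // Drop any previous entry for this name so the index never goes stale.
    dropPartition(name);
    // Interning lets equal location strings share a single String instance.
    String key = location.intern();
    partitions.put(name, key);
    byLocation.computeIfAbsent(key, k -> new HashSet<>()).add(name);
  }

  public void dropPartition(String name) {
    String location = partitions.remove(name);
    if (location == null) return;  // Unknown partition, nothing to clean up.
    Set<String> names = byLocation.get(location);
    names.remove(name);
    if (names.isEmpty()) byLocation.remove(location);
  }

  /** Option A: scan every partition per request; cost grows with partition count. */
  public boolean isLocationSharedByScan(String location) {
    int count = 0;
    for (String loc : partitions.values()) {
      if (loc.equals(location) && ++count > 1) return true;
    }
    return false;
  }

  /** Option B: a single hash lookup, paid for with the memory of the extra map. */
  public boolean isLocationSharedByIndex(String location) {
    Set<String> names = byLocation.get(location);
    return names != null && names.size() > 1;
  }
}
```

Both checks return the same answer; the difference is per-request cost versus resident memory, which is what a benchmark on a table with many partitions would need to measure.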

Gabor, any thoughts on this?


http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py@159
PS1, Line 159:       assert data.split('\t') == ['21', '6']
> Nice catch as dropping a partition in this case would cause some issue. Not
I see. Did you get a chance to see how Apache Hive behaves in this case?



--
To view, visit http://gerrit.cloudera.org:8080/10543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a
Gerrit-Change-Number: 10543
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 31 May 2018 20:32:08 +0000
Gerrit-HasComments: Yes
