Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10543 )

Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same 
location
......................................................................


Patch Set 16:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/10543/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10543/15//COMMIT_MSG@21
PS15, Line 21: For managed tables it's no longer allowed to drop a partition 
that
> For managed tables,...
Done


http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1227
PS12, Line 1227: partitionSet == nu
> Isn't this more like a precondition? Also you could just say
Done


http://gerrit.cloudera.org:8080/#/c/10543/15/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10543/15/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1202
PS15, Line 1202:     HashSet<HdfsPartition> partitions = 
locationToPartMap_.get(location);
> Preconditions.checkNotNull(partitions);
Done


http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2245
PS12, Line 2245:             
((HdfsTable)tbl).isLocationSharedWithOtherPartitions(part)) {
> Should we check if the table is "managed" / "external" in Hive? I'm guessin
Since dropping a partition from an external table doesn't actually drop the 
physical directory for the partition, I think it's a good idea not to indicate 
error here for external tables.
Thx!

There was an weird behaviour I found here: I created a managed table, added 
some partitions, then created an external table pointing to the managed table, 
added few more partitions and checked them in the managed table. Apparently, 
the partitions are added to the table's directory in HDFS but the ones created 
for the external table are not visible to the managed table even after an 
invalidate metadata. This is unrelated to this patch but still is weird.


http://gerrit.cloudera.org:8080/#/c/10543/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2247
PS12, Line 2247: t
> Could you format the string to add the partition data here.
Good point! Do you have a preference how many partition names should be 
displayed? I picked 5, do you think it's sufficient?


http://gerrit.cloudera.org:8080/#/c/10543/15/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10543/15/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2244
PS15, Line 2244: !Table.isExternalTable(tb
> Remove. This already holds via L2229.
Done


http://gerrit.cloudera.org:8080/#/c/10543/12/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

http://gerrit.cloudera.org:8080/#/c/10543/12/tests/metadata/test_partition_metadata.py@206
PS12, Line 206:     # notice that some partitions have been dropped by Hive.
> May be just call refresh instead?
Done


http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py@173
PS15, Line 173:  This
              :     test checks
> Typo
Done


http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py@174
PS15, Line 174: partition
> Remove
Done


http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py@177
PS15, Line 177: pytest.skip()
> Reading through this again, I feel that this can be enabled in the current
I preferred to disable this test for two reasons:
- I didn't want to assert on some condition that actually not the desired 
outcome of this scenario.
- I think if we assert on j=2 still being a valid partition then once 
HIVE-19830 makes it to Impala it would break the builds and someone probably 
not familiar to this change has to figure out why it went wrong. I found it a 
bit safer to skip the test and enable it once the related Hive change is done.

What do you think?


http://gerrit.cloudera.org:8080/#/c/10543/15/tests/metadata/test_partition_metadata.py@209
PS15, Line 209: As Hive dropped j=1 partition it is e
> Update
Done



--
To view, visit http://gerrit.cloudera.org:8080/10543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a
Gerrit-Change-Number: 10543
Gerrit-PatchSet: 16
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 20 Jun 2018 06:47:24 +0000
Gerrit-HasComments: Yes

Reply via email to