Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
......................................................................


Patch Set 4:

(6 comments)

Thanks for the updated patch. The patch looks mostly good from my side except 
for few minor suggestions below.

http://gerrit.cloudera.org:8080/#/c/16159/4/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/16159/4/common/thrift/CatalogObjects.thrift@292
PS4, Line 292: list<Exprs.TExpr>
curious to know if it is intentional that this field is not optional? Also, 
adding a comment here to say that fields are optional because THdfsPartition 
are minimally filled as required when topic mode is minimal.


http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@748
PS4, Line 748: setId
Is the delete flag really needed in this method? For instance, do you think its 
cleaner if we simply set the following here irrespective of the delete flag 
value:

partObject.setId(obj.hdfs_partition.id);
partObject.setPrevId(obj.hdfs_partition.prev_id);

and on the local coordinator side, if the HdfsPartition topic entry has delete 
flag = true we invalidate the the HdfsPartition using the id, else we 
invalidate the HdfsPartition using prev_id; I understand in that case the 
invalidatePartition method in the CatalogMetaProvider will have to take in the 
delete flag but that way we are not leaking implementation details from local 
coordinator to catalog side.


http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@750
PS4, Line 750:           // This is an update on a new partition (!isDelete && 
prev_id == -1).
             :           // Nothing to invalidate.
             :           return null
not sure if you can fully avoid this issue anyways. For instance, if a 
partition is updated twice between topic updates, the second instance of the 
partition will have the prev_id set.


http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@93
PS4, Line 93: The catalog versions are
            :  * not used actually.
Should we Override the getCatalogVersion and setCatalogVersion methods to throw 
UnsupportedOperationException instead so that they are not accidently used in 
future?


http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@399
PS4, Line 399: // Ignore partition deletions in catalog-v1.
may be add more specific message here:
// we ignore partition deletions in catalog-v1 mode since they are handled 
during table updates to atomically update the table.


http://gerrit.cloudera.org:8080/#/c/16159/4/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/16159/4/tests/custom_cluster/test_local_catalog.py@187
PS4, Line 187: id=%%d
Do we have to verify the partition ids? Shouldn't the message with the 
partition name be sufficient? Also, another way to confirm invalidation is be 
making sure the the next query is loading the partitions again by confirming if 
the metric counters for the cache hit/misses in the profile are incremented 
correctly.



--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Anurag Mantripragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Comment-Date: Thu, 23 Jul 2020 19:05:35 +0000
Gerrit-HasComments: Yes

Reply via email to