Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16159 )
Change subject: IMPALA-3127: Support incremental metadata updates in partition level ...................................................................... Patch Set 4: (6 comments) Thanks for the updated patch. The patch looks mostly good from my side except for few minor suggestions below. http://gerrit.cloudera.org:8080/#/c/16159/4/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/16159/4/common/thrift/CatalogObjects.thrift@292 PS4, Line 292: list<Exprs.TExpr> curious to know if it is intentional that this field is not optional? Also, adding a comment here to say that fields are optional because THdfsPartition are minimally filled as required when topic mode is minimal. http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@748 PS4, Line 748: setId Is the delete flag really needed in this method? For instance, do you think its cleaner if we simply set the following here irrespective of the delete flag value: partObject.setId(obj.hdfs_partition.id); partObject.setPrevId(obj.hdfs_partition.prev_id); and on the local coordinator side, if the HdfsPartition topic entry has delete flag = true we invalidate the the HdfsPartition using the id, else we invalidate the HdfsPartition using prev_id; I understand in that case the invalidatePartition method in the CatalogMetaProvider will have to take in the delete flag but that way we are not leaking implementation details from local coordinator to catalog side. http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@750 PS4, Line 750: // This is an update on a new partition (!isDelete && prev_id == -1). : // Nothing to invalidate. : return null not sure if you can fully avoid this issue anyways. For instance, if a partition is updated twice between topic updates, the second instance of the partition will have the prev_id set. http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@93 PS4, Line 93: The catalog versions are : * not used actually. Should we Override the getCatalogVersion and setCatalogVersion methods to throw UnsupportedOperationException instead so that they are not accidently used in future? http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@399 PS4, Line 399: // Ignore partition deletions in catalog-v1. may be add more specific message here: // we ignore partition deletions in catalog-v1 mode since they are handled during table updates to atomically update the table. http://gerrit.cloudera.org:8080/#/c/16159/4/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/16159/4/tests/custom_cluster/test_local_catalog.py@187 PS4, Line 187: id=%%d Do we have to verify the partition ids? Shouldn't the message with the partition name be sufficient? Also, another way to confirm invalidation is be making sure the the next query is loading the partitions again by confirming if the metric counters for the cache hit/misses in the profile are incremented correctly. -- To view, visit http://gerrit.cloudera.org:8080/16159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870 Gerrit-Change-Number: 16159 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang <[email protected]> Gerrit-Reviewer: Anurag Mantripragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]> Gerrit-Comment-Date: Thu, 23 Jul 2020 19:05:35 +0000 Gerrit-HasComments: Yes
