Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16159 )
Change subject: IMPALA-3127: Support incremental metadata updates in partition level ...................................................................... Patch Set 5: (6 comments) Thank Vihang! Addressed the comments. http://gerrit.cloudera.org:8080/#/c/16159/4/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/16159/4/common/thrift/CatalogObjects.thrift@292 PS4, Line 292: list<Exprs.TExpr> > curious to know if it is intentional that this field is not optional? Also, Oops, this is optional too. Done. http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@748 PS4, Line 748: > Is the delete flag really needed in this method? For instance, do you think I'm ok to both the current solution and the one you insist on. Moved the delete flag checking to the coordinator side. http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@750 PS4, Line 750: break; : case CATALOG: : // Sending th > not sure if you can fully avoid this issue anyways. For instance, if a part Good point! This is not handled correctly. We can trace the prev_id "link" to an id that's not in the droppedPartitions set of the table. I tend to leave this as an optimization because there may be other corner cases to cover (e.g. when the table instance is replaced so lossing its droppedPartitions set), and coordinators can work with the current solution in many cases. Will create a JIRA for this. http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@93 PS4, Line 93: The catalog versions are : * not used actually. > Should we Override the getCatalogVersion and setCatalogVersion methods to t Yeah! Done. http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@399 PS4, Line 399: // Ignore partition deletions in catalog-v1 > may be add more specific message here: Done http://gerrit.cloudera.org:8080/#/c/16159/4/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/16159/4/tests/custom_cluster/test_local_catalog.py@187 PS4, Line 187: id=%%d > Do we have to verify the partition ids? Shouldn't the message with the > partition name be sufficient? I think it's not bad to verify the partition ids as well. We can extend this test to run a concurrent query to load the latest partitiion instances. Then we need to make sure only the old (stale) instances are invalidated. > Also, another way to confirm invalidation is be making sure the the next > query is loading the partitions again by confirming if the metric counters > for the cache hit/misses in the profile are incremented correctly. Yeah, but I prefer this way because the invalidations are the results of the INSERT statement. By this we can be 100% sure that the invalidations are triggered by the catalog topic updates, not other mechanisms like cache TTL, RSU eviction. -- To view, visit http://gerrit.cloudera.org:8080/16159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870 Gerrit-Change-Number: 16159 Gerrit-PatchSet: 5 Gerrit-Owner: Quanlong Huang <[email protected]> Gerrit-Reviewer: Anurag Mantripragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]> Gerrit-Comment-Date: Mon, 27 Jul 2020 09:53:15 +0000 Gerrit-HasComments: Yes
