Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
......................................................................


Patch Set 5:

(6 comments)

Thank Vihang! Addressed the comments.

http://gerrit.cloudera.org:8080/#/c/16159/4/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/16159/4/common/thrift/CatalogObjects.thrift@292
PS4, Line 292: list<Exprs.TExpr>
> curious to know if it is intentional that this field is not optional? Also,
Oops, this is optional too. Done.


http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@748
PS4, Line 748:
> Is the delete flag really needed in this method? For instance, do you think
I'm ok to both the current solution and the one you insist on. Moved the delete 
flag checking to the coordinator side.


http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@750
PS4, Line 750:         break;
             :       case CATALOG:
             :         // Sending th
> not sure if you can fully avoid this issue anyways. For instance, if a part
Good point! This is not handled correctly. We can trace the prev_id "link" to 
an id that's not in the droppedPartitions set of the table. I tend to leave 
this as an optimization because there may be other corner cases to cover (e.g. 
when the table instance is replaced so lossing its droppedPartitions set), and 
coordinators can work with the current solution in many cases. Will create a 
JIRA for this.


http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@93
PS4, Line 93: The catalog versions are
            :  * not used actually.
> Should we Override the getCatalogVersion and setCatalogVersion methods to t
Yeah! Done.


http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/4/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@399
PS4, Line 399: // Ignore partition deletions in catalog-v1
> may be add more specific message here:
Done


http://gerrit.cloudera.org:8080/#/c/16159/4/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/16159/4/tests/custom_cluster/test_local_catalog.py@187
PS4, Line 187: id=%%d
> Do we have to verify the partition ids? Shouldn't the message with the 
> partition name be sufficient?

I think it's not bad to verify the partition ids as well. We can extend this 
test to run a concurrent query to load the latest partitiion instances. Then we 
need to make sure only the old (stale) instances are invalidated.

> Also, another way to confirm invalidation is be making sure the the next 
> query is loading the partitions again by confirming if the metric counters 
> for the cache hit/misses in the profile are incremented correctly.

Yeah, but I prefer this way because the invalidations are the results of the 
INSERT statement. By this we can be 100% sure that the invalidations are 
triggered by the catalog topic updates, not other mechanisms like cache TTL, 
RSU eviction.



--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Anurag Mantripragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Comment-Date: Mon, 27 Jul 2020 09:53:15 +0000
Gerrit-HasComments: Yes

Reply via email to