Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
......................................................................


Patch Set 3:

(13 comments)

Sorry for the delay in the review. I took another pass and overall the approach 
makes sense to me. I think it would be useful if we could have an exhaustive 
test (may be in a separate jira) to make sure that we are not leaking 
partitions in statestore. The test could add/drop partitions along with 
multiple add/invalidate/drop table commands and make sure that the number of 
partition keys in the statestore is as per our expectation.

http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@712
PS3, Line 712:           if (!delete && obj.type == HDFS_PARTITION) return;
Its unclear to me that when we generate the minimalObject when delete flag is 
false, we set the prev_id in the partObject.setId on line 746. However, that 
object is never sent as per this line. How do we invalidate the updated 
partitions when topicMode is minimal?

Also, would be good to add a comment here (or somewhere else if more 
appropriate) explaining what we send in case of partitions in both v1 and v2 
modes since it is not trivial to understand the subtle differences.

For instance, as I understand:
1. In v1 mode (topicMode = full), we only send the partitionIds in the thrift 
table which represents the current list of the partitions. Additionally, for 
each newly added/removed partition we send a THdfsPartition in the same topic 
update. However, coordinators detect the removal of any partitions by absence 
of a id partitionIds in the table object.
2. In v2 mode (topicMode = minimal), local coordinators only load what they 
need and hence we only send deleted partitionIds. Updated partitions are also 
treated as a special case of deleted partitions by sending the previous 
partitionId for such partitions so that local coordinators invalidate them 
proactively.


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@742
PS3, Line 742:         if (isDelete) {
             :           partObject.setId(obj.hdfs_partition.id);
             :         } else if (obj.hdfs_partition.prev_id != 
HdfsPartition.INITIAL_PARTITION_ID - 1) {
             :           // For updates, coordinators can invalidate the old 
partition instance.
             :           partObject.setId(obj.hdfs_partition.prev_id);
             :         }
This looks a bit hacky to me. Do you think it would be more readable by adding 
a explicit field to THdfsPartition called prev_id or a boolean indicating that 
the id is previous id and let local coordinator decide how to process the 
partition. This implementation on the catalogd side assumes the knowledge of 
how local catalog invalidates stuff and I think it would cleaner to keep them 
independent as much as possible.

If you decide to have prev_id field in the THdfsPartition you can keep a 
default value of field to -1 so that its not serialized unless its set.


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1295
PS3, Line 1295:           catalogTbl.setTable(((HdfsTable) 
tbl).toThriftWithPartitionIds());
wouldn't this line be called for both fullUpdate and a incremental update? The 
javadoc of the method toThriftWithPartitionIds suggests this is called for 
incremental updates which seems confusing.


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1321
PS3, Line 1321:       if (maxSentId < catalogPart.getHdfs_partition().getId()) {
              :         maxSentId = catalogPart.getHdfs_partition().getId();
              :       }
nit, perhaps this is more readable?
maxSentId = Math.max(maxSentId, catalogPart.getHdfs_partition().getId());


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1329
PS3, Line 1329: f (!ctx.isFullUpdate())
Can you clarify why this is needed only in case of incremental updates? What 
happens on the coordinators when a statestore is restarted and then it requests 
for a full-update?


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@90
PS3, Line 90: HdfsPartition
I think it is worth documenting that even though this extends 
CatalogObjectImpl, we don't explicitly have a different catalog version for 
Partitions. All the partitions of a HdfsTable will have the same catalogVersion 
as its parent table.


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@273
PS3, Line 273: cached
nit, replace with "known" instead of "cached" since local coordinators don't 
necessarily cache these partitions.


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@278
PS3, Line 278: HdfsPartition
nit, Do we need to store the full HdfsPartition if it is dropped? Seems like we 
can only store a minimal HdfsPartition.


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@208
PS3, Line 208:     Map<TableName, List<THdfsPartition>> newPartitionsByTable = 
new HashMap<>();
Instead of tracking the partitions separately in this map do you think it would 
be more readable to modify the ObjectSequencer to order the updates such that 
the order is topLevelObject, Table, HdfsPartition (using a PriorityQueue)? The 
advantage of doing that would be that we will have separate methods for 
addPartition, removePartition which fits nicely with the existing logic instead 
of sending over this map to all the table methods.


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@466
PS3, Line 466: Detected by
             :     // non-empty partition list and empty partition map.
can you please update this comment?


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@492
PS3, Line 492: !tHdfsTable.partitions.containsKey(tPart.id)
Why is this not a Preconditions.checkState?


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1378
PS3, Line 1378: useless
nit, may be replace with "obsolete"?


http://gerrit.cloudera.org:8080/#/c/16159/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1451
PS3, Line 1451: partition(id=%d) of table %s.%s
nit, may be more helpful to print the partitionNames here instead of id? Also, 
may be reduce the verbosity the string using perhaps 
dbname.tblname.partitionName format?



--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Anurag Mantripragada <anu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Jul 2020 17:18:29 +0000
Gerrit-HasComments: Yes

Reply via email to