Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
......................................................................


Patch Set 1:

(6 comments)

Took a first pass and left some comments below. I will spend some more time 
doing a more detailed review while you address the comments.

http://gerrit.cloudera.org:8080/#/c/16159/1/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/16159/1/common/thrift/CatalogObjects.thrift@424
PS1, Line 424: partiton
nit, partition


http://gerrit.cloudera.org:8080/#/c/16159/1/common/thrift/CatalogObjects.thrift@425
PS1, Line 425:   // updates incrementally. THdfsPartitions in the list are in 
the minimal mode which
             :   // only consists of partition id and partition name.
             :   14: optional list<THdfsPartition> partition_list
Is there any value of having a new field? Seems like this list is always 
derived from the partitionMap and contains all the partition ids. Can we reuse 
the partitions map for this purpose?


http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@632
PS1, Line 632: " "
I feel having space in the catalogObjectKey is bit unconventional and can cause 
parsing and logging issues. Were there any concerns with using ':' which seems 
to be used as a delimiter already to distinguish the object type or '/' which 
is used to separate partition keys?


http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@710
PS1, Line 710:         if (obj.type != HDFS_PARTITION || delete) {
             :           minimalObject = getMinimalObjectForV2(obj);
             :         }
You may want to consider sending the updates for partitions as well since they 
will help invalidate the partitions in catalog-v2 proactively instead of 
relying on the default aging out of Partitions in guava cache.


http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1614
PS1, Line 1614: toThriftWithoutPartitions
perhaps a better name could be toThriftWithMinimalPartitions since we are 
actually adding some info from partitions. Also, a java doc will be helpful.


http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@469
PS1, Line 469: newTable instanceof HdfsTable && 
thriftTable.getHdfs_table().partitions.isEmpty()
             :         && 
CollectionUtils.isNotEmpty(thriftTable.getHdfs_table().getPartition_list()
I think it would be more readable if we move this to a method called 
isFromIncrementalUpdate(newTable), or some better name.

Also, I feel thriftTable.getHdfs_table().partitions and 
thriftTable.getHdfs_table().getPartition_list are very similar. Perhaps may be 
we can rename the list to something more recognizable.



--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Anurag Mantripragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Comment-Date: Tue, 14 Jul 2020 23:13:34 +0000
Gerrit-HasComments: Yes

Reply via email to