Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16159 )
Change subject: IMPALA-3127: Support incremental metadata updates in partition level ...................................................................... Patch Set 1: (6 comments) Took a first pass and left some comments below. I will spend some more time doing a more detailed review while you address the comments. http://gerrit.cloudera.org:8080/#/c/16159/1/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/16159/1/common/thrift/CatalogObjects.thrift@424 PS1, Line 424: partiton nit, partition http://gerrit.cloudera.org:8080/#/c/16159/1/common/thrift/CatalogObjects.thrift@425 PS1, Line 425: // updates incrementally. THdfsPartitions in the list are in the minimal mode which : // only consists of partition id and partition name. : 14: optional list<THdfsPartition> partition_list Is there any value of having a new field? Seems like this list is always derived from the partitionMap and contains all the partition ids. Can we reuse the partitions map for this purpose? http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@632 PS1, Line 632: " " I feel having space in the catalogObjectKey is bit unconventional and can cause parsing and logging issues. Were there any concerns with using ':' which seems to be used as a delimiter already to distinguish the object type or '/' which is used to separate partition keys? http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@710 PS1, Line 710: if (obj.type != HDFS_PARTITION || delete) { : minimalObject = getMinimalObjectForV2(obj); : } You may want to consider sending the updates for partitions as well since they will help invalidate the partitions in catalog-v2 proactively instead of relying on the default aging out of Partitions in guava cache. http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1614 PS1, Line 1614: toThriftWithoutPartitions perhaps a better name could be toThriftWithMinimalPartitions since we are actually adding some info from partitions. Also, a java doc will be helpful. http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@469 PS1, Line 469: newTable instanceof HdfsTable && thriftTable.getHdfs_table().partitions.isEmpty() : && CollectionUtils.isNotEmpty(thriftTable.getHdfs_table().getPartition_list() I think it would be more readable if we move this to a method called isFromIncrementalUpdate(newTable), or some better name. Also, I feel thriftTable.getHdfs_table().partitions and thriftTable.getHdfs_table().getPartition_list are very similar. Perhaps may be we can rename the list to something more recognizable. -- To view, visit http://gerrit.cloudera.org:8080/16159 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870 Gerrit-Change-Number: 16159 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang <[email protected]> Gerrit-Reviewer: Anurag Mantripragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]> Gerrit-Comment-Date: Tue, 14 Jul 2020 23:13:34 +0000 Gerrit-HasComments: Yes
