Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19391 )

Change subject: IMPALA-11812: Deduplicate column schema in hmsPartitions
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/19391/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/19391/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1862
PS6, Line 1862: fetchPartitionsByName
> nit. This method probably should be renamed to fetchPartitionsByTable() due
I think the name is ok since the 'ByName' refers to the names of the partitions 
rather than the table. There's a corresponding underlying HMS api called 
get_partitions_by_name() so this is a higher level wrapper for that.


http://gerrit.cloudera.org:8080/#/c/19391/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/19391/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@208
PS6, Line 208: LOG.info("Fetched {}/{} partitions for table {}.{}", numDone, 
partNames.size(),
             :           msTbl.getDbName(), msTbl.getTableName());
> This probably will log <maxPartitionsPerRpc_> number of progressive message
Actually, the number of logged messages would be 
partNames.size()/maxPartitionsPerRpc_ .  If there are 5000 partitions to fetch 
and the batch size per RPC is 1000 partitions, then there will be 5 such 
messages. If the batch size is too small, yes it will generate more messages 
but in general it is not a bad idea to track the progress.  I am ok either way 
but will let Quanlong respond.


http://gerrit.cloudera.org:8080/#/c/19391/6/tests/custom_cluster/test_wide_table_operations.py
File tests/custom_cluster/test_wide_table_operations.py:

http://gerrit.cloudera.org:8080/#/c/19391/6/tests/custom_cluster/test_wide_table_operations.py@36
PS6, Line 36:     if cls.exploration_strategy() != 'exhaustive':
This is a good test case and given the high runtime it makes sense to restrict 
this to exhaustive run only. I am wondering if there is any intermediate 
message that should be displayed to the console ..e.g after each of the 
recover_stmt, invalidate_stmt, show_parts_stmt execute, a success message could 
be shown.  Also, how long does table creation itself take ? Developers may want 
to know if things are stuck or progressing. However, this is a nice to have, 
not must do.



--
To view, visit http://gerrit.cloudera.org:8080/19391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I511ecca0ace8bea4c24a19a54fb0a75390e50c4d
Gerrit-Change-Number: 19391
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Thu, 29 Dec 2022 20:39:11 +0000
Gerrit-HasComments: Yes

Reply via email to