zhangbutao commented on code in PR #4984:
URL: https://github.com/apache/hive/pull/4984#discussion_r1470673879


##########
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift:
##########
@@ -725,7 +725,7 @@ struct SetPartitionsStatsRequest {
 2: optional bool needMerge, //stats need to be merged with the existing stats
 3: optional i64 writeId=-1,         // writeId for the current query that 
updates the stats
 4: optional string validWriteIdList, // valid write id list for the table for 
which this struct is being sent
-5: required string engine //engine creating the current request
+5: optional string engine = "hive" //engine creating the current request

Review Comment:
   Yes, you are almost right. To be more specific, for example, Trino will get 
hms table column statistics by its thrift hms client which is compatible with 
hive3.
   
https://github.com/trinodb/trino/blob/057f2d435d6461855ff0f904dbd675b39db97d1e/plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/thrift/ThriftHiveMetastoreClient.java#L331-L336
   
   But `TableStatsRequest` in Hive4 need pass `engine` through constructor 
function, and the `engine` field is required, therefore:
   
   1. for hive3 thrift client, this api call will fail due to the missed 
`engine `field if HMS server doesn't have this PR, As there is no place for 
hive3 thrift client to pass `engine `field. But this api call will sucessfully 
if HMS has this PR(`engine` field is optional and has the defalut value).
   2. for hive4 thrift client without this PR, you also must pass `engine 
`field in `TableStatsRequest` to call this api correctly no matter if HMS 
server has this PR. Because `engine` field is needed in `TableStatsRequest` 
constructor function.
   3. for hive4 thrift client with this PR. Whether HMS server have this PR or 
not
   , you can call the api sucessfully without pass `engine` as the `engine` 
field is not mandatory in `TableStatsRequest` constructor function, and you can 
explicitly reset the `engine` fileld by `TableStatsRequest::setEngine` if you 
don't want the default `hive engine` column statistics. But i think most  
framework(like trino) will want reuse hive engine column statistics.
   
   Sounds a little verbose.. :) In one word, this PR can make hive3 thrift hms 
client worked well with hive4 hms when calling statistics related api. I have 
tested this PR with Trino, and i think it is good.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to