zhangbutao commented on code in PR #4984:
URL: https://github.com/apache/hive/pull/4984#discussion_r1470673879
##########
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift:
##########
@@ -725,7 +725,7 @@ struct SetPartitionsStatsRequest {
2: optional bool needMerge, //stats need to be merged with the existing stats
3: optional i64 writeId=-1, // writeId for the current query that
updates the stats
4: optional string validWriteIdList, // valid write id list for the table for
which this struct is being sent
-5: required string engine //engine creating the current request
+5: optional string engine = "hive" //engine creating the current request
Review Comment:
Yes, you are almost right. To be more specific, for example, Trino will get
hms table column statistics by its thrift hms client which is compatible with
hive3.
https://github.com/trinodb/trino/blob/057f2d435d6461855ff0f904dbd675b39db97d1e/plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/thrift/ThriftHiveMetastoreClient.java#L331-L336
But `TableStatsRequest` in Hive4 need pass `engine` through constructor
function, and the `engine` field is required, therefore:
1. for hive3 thrift client, this api call will fail due to the missed
`engine `field if HMS server doesn't have this PR, As there is no place for
hive3 thrift client to pass `engine `field. But this api call will sucessfully
if HMS has this PR(`engine` field is optional and has the defalut value).
2. for hive4 thrift client without this PR, you also must pass `engine
`field in `TableStatsRequest` to call this api correctly no matter if HMS
server has this PR. Because `engine` field is needed in `TableStatsRequest`
constructor function.
3. for hive4 thrift client with this PR. Whether HMS server have this PR or
not
, you can call the api sucessfully without pass `engine` as the `engine`
field is not mandatory in `TableStatsRequest` constructor function, and you can
explicitly reset the `engine` fileld by `TableStatsRequest::setEngine` if you
don't want the default `hive engine` column statistics. But i think most
framework(like trino) will want reuse hive engine column statistics.
Sounds a little verbose.. :) In one word, this PR can make hive3 thrift hms
client worked well with hive4 hms when calling statistics related api. I have
tested this PR with Trino, and i think it is good.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]