Hello,

I'm not sure this is only about other projects that use the
metastore. If we change the storage for timestamps, aren't we going to
break existing Hive deployments as well on the next upgrade?

I am including the user@ list since this is not a pure dev discussion
but can also impact existing users.

Best,
Stamatis

On Wed, Feb 4, 2026 at 11:22 AM Thomas Rebele <[email protected]> wrote:
>
> Hi Hive community,
>
> I'm working on HIVE-29398 to make the Hive metastore more compatible with
> other projects that use it (e.g., Impala). In 2019, HIVE-22311 (Propagate
> min/max column values from statistics to the optimizer for timestamp type)
> introduced a struct TimestampColumnStatsData in the thrift definition. It
> seems that this change to the thrift code was not necessary, as the timestamp
> statistics can be passed via the existing LongColumnStatsData as well. Impala
> actually expects the statistics that way. I worked on a property to
> switch back to the old behavior.
>
> In the review of the PR, Krisztian Kasa suggested asking the community
> whether it would be possible to undo the change of HIVE-22311. A while ago I
> prepared a patch to undo the changes to the thrift code while still keeping
> the benefits of propagating the stats to the optimizer, so it is possible.
> I'm quite new to Hive, so I don't know much about the consequences of
> removing a field from the thrift code. Is it actually advisable to remove the
> timestampStats field from hive_metastore.thrift? Are there other projects
> that started to use the timestamp stats field? If we decide to drop
> the field, those projects would need to use the long field instead.
>
> Best regards,
> Thomas Rebele
