JK Pasimuthu created HIVE-27903:
-----------------------------------
Summary: TBLPROPERTIES('history.expire.max-snapshot-age-ms')
doesn't work
Key: HIVE-27903
URL: https://issues.apache.org/jira/browse/HIVE-27903
Project: Hive
Issue Type: Improvement
Components: Hive
Affects Versions: 4.0.0-alpha-2
Reporter: JK Pasimuthu
[https://github.com/apache/iceberg/issues/9123]
The 'history.expire.max-snapshot-age-ms' option doesn't have any effect while
expiring snapshots.
#
CREATE TABLE IF NOT EXISTS test5d78b6 (
id INT, random1 STRING
)
PARTITIONED BY (random2 STRING)
STORED BY ICEBERG
TBLPROPERTIES (
'write.format.default'='orc',
'format-version'='2',
'write.orc.compression-codec'='none'
)
# INSERT INTO test5d78b6 SELECT if(isnull(MAX(id)) ,0 , MAX(id) ) +1, uuid(),
uuid() FROM test5d78b6
# INSERT INTO test5d78b6 SELECT if(isnull(MAX(id)) ,0 , MAX(id) ) +1, uuid(),
uuid() FROM test5d78b6
# SLEEP for 30 seconds
# INSERT INTO test5d78b6 SELECT if(isnull(MAX(id)) ,0 , MAX(id) ) +1, uuid(),
uuid() FROM test5d78b6
# INSERT INTO test5d78b6 SELECT if(isnull(MAX(id)) ,0 , MAX(id) ) +1, uuid(),
uuid() FROM test5d78b6
# SELECT (UNIX_TIMESTAMP(CURRENT_TIMESTAMP) - UNIX_TIMESTAMP('2023-10-09
13:23:54.455')) * 1000;
# ALTER TABLE test5d78b6 SET
tblproperties('history.expire.max-snapshot-age-ms'='54000'); - the elapsed time
in ms from the second insert and current time
# ALTER TABLE test5d78b6 EXECUTE expire_snapshots('2200-10-10');
# SELECT COUNT FROM default.test5d78b6.snapshots;
output: 1. it should be 2 rows. The default 1 is retained an all snapshots are
expired as usual, so setting the property has no effect.
Additional Info: the default value for 'history.expire.max-snapshot-age-ms' is
5 days per this link: [https://iceberg.apache.org/docs/1.3.1/configuration/]
Now while writing the tests and running them, the expiring snapshots just
worked fine within few seconds of the snapshots being created.
So, I'm assuming that this option doesn't have any effect right now. Having
said that, I'm thinking the implications on end user will have if we fix this.
The end user may not know about this option at all and will have tough time
figuring out why the snapshots are not getting expired. One option could be to
set the default to 0ms.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)