[ https://issues.apache.org/jira/browse/HIVE-25286?focusedWorklogId=634979&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-634979 ]
ASF GitHub Bot logged work on HIVE-25286:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 06/Aug/21 07:34
Start Date: 06/Aug/21 07:34
Worklog Time Spent: 10m
Work Description: marton-bod commented on a change in pull request #2427:
URL: https://github.com/apache/hive/pull/2427#discussion_r684015292
##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##########
@@ -2377,6 +2392,35 @@ public void testAsOfWithJoins() throws IOException, InterruptedException {
     Assert.assertEquals(8, rows.size());
   }
+  @Test
+  public void testStatsRemoved() throws IOException {
+    Assume.assumeTrue("Only HiveCatalog can remove stats which become obsolete",
+        testTableType == TestTables.TestTableType.HIVE_CATALOG);
+
+    TableIdentifier identifier = TableIdentifier.of("default", "customers");
+
+    shell.setHiveSessionValue(HiveConf.ConfVars.HIVESTATSAUTOGATHER.varname, true);
+    testTables.createTable(shell, identifier.name(), HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA,
+        PartitionSpec.unpartitioned(), fileFormat, ImmutableList.of());
+
+    String insert = testTables.getInsertQuery(HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS, identifier, true);
+    shell.executeStatement(insert);
+
+    checkColStat(identifier.name(), "customer_id", true);
+    checkColStatMinMaxValue(identifier.name(), "customer_id", 0, 2);
+
+    // Create a Catalog where the KEEP_HIVE_STATS is false
+    shell.metastore().hiveConf().set(HiveTableOperations.KEEP_HIVE_STATS, StatsSetupConst.FALSE);
+    TestTables nonHiveTestTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, testTableType, temp);
+    Table nonHiveTable = nonHiveTestTables.loadTable(identifier);
+
+    // Append data to the table through this non-Hive Catalog
Review comment:
It's still a HiveCatalog though, isn't it? I thought we were simulating
something more like another compute engine, where keep.stats=false is set.
Issue Time Tracking
-------------------
Worklog Id: (was: 634979)
Time Spent: 40m (was: 0.5h)
> Set stats to inaccurate when an Iceberg table is modified outside Hive
> ----------------------------------------------------------------------
>
> Key: HIVE-25286
> URL: https://issues.apache.org/jira/browse/HIVE-25286
> Project: Hive
> Issue Type: New Feature
> Reporter: Peter Vary
> Assignee: Peter Vary
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> When an Iceberg table is modified outside of Hive, the stats should be
> set to inaccurate, since there is no way to ensure that the HMS stats are
> updated correctly, and stale stats could cause incorrect query results.
> The proposed solution works only for HiveCatalog.
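
The behaviour described above can be modelled with a minimal, self-contained sketch. It is not the code from the pull request: the class and method names are hypothetical, the HMS table parameters are reduced to a plain map, and it assumes that "setting stats to inaccurate" amounts to dropping the COLUMN_STATS_ACCURATE marker whenever the committing engine has not opted in to keeping Hive stats.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the idea in HIVE-25286, not the actual Hive/Iceberg code.
public class StatsInvalidationSketch {

  // Table-parameter key that Hive uses to mark stats as accurate
  // (StatsSetupConst.COLUMN_STATS_ACCURATE in Hive).
  static final String COLUMN_STATS_ACCURATE = "COLUMN_STATS_ACCURATE";

  // Assumption: a commit made with keepHiveStats=false (for example by another
  // compute engine) cannot guarantee that the HMS stats still match the data,
  // so the accuracy marker is removed. All other parameters are left untouched.
  static Map<String, String> parametersAfterCommit(Map<String, String> current, boolean keepHiveStats) {
    Map<String, String> updated = new HashMap<>(current);
    if (!keepHiveStats) {
      updated.remove(COLUMN_STATS_ACCURATE);
    }
    return updated;
  }

  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    params.put(COLUMN_STATS_ACCURATE, "{\"BASIC_STATS\":\"true\"}");
    params.put("table_type", "ICEBERG");

    // Commit through Hive itself: the marker is kept.
    System.out.println(parametersAfterCommit(params, true).containsKey(COLUMN_STATS_ACCURATE));
    // Commit from outside Hive: the marker is dropped, so the optimizer
    // should no longer answer queries from the stored stats.
    System.out.println(parametersAfterCommit(params, false).containsKey(COLUMN_STATS_ACCURATE));
  }
}
```

Removing the marker rather than rewriting the stats keeps the sketch conservative: the old values stay in place but are no longer trusted until Hive gathers stats again.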