Rajesh Balamohan created HIVE-24367:
---------------------------------------
             Summary: Explore whether HiveAlterHandler::alterTable can be optimised for non-partitioned tables
                 Key: HIVE-24367
                 URL: https://issues.apache.org/jira/browse/HIVE-24367
             Project: Hive
          Issue Type: Improvement
          Components: HiveServer2
            Reporter: Rajesh Balamohan


Writing many deltas to a non-partitioned table causes runtime delays once a large number of delta folders are present.

The following code in HiveAlterHandler is invoked for every insert operation. It calls {{updateTableStatsSlow}} on every insert, which causes the delays.

{noformat}
if (MetaStoreUtils.requireCalStats(null, null, newt, environmentContext) && !isPartitionedTable) {
  Database db = msdb.getDatabase(catName, newDbName);
  assert(isReplicated == HiveMetaStore.HMSHandler.isDbReplicationTarget(db));
  // Update table stats. For partitioned table, we update stats in alterPartition()
  MetaStoreUtils.updateTableStatsSlow(db, newt, wh, false, true, environmentContext);
}
{noformat}

It would be good to explore whether only the newly added delta can be listed when computing stats. This would avoid a huge listing call during stats collection.

Example queries to reproduce:

{noformat}
CREATE TABLE IF NOT EXISTS test (name String, value int);
INSERT INTO test VALUES('K1',1);
INSERT INTO test VALUES('K2',2);
..
..
..
INSERT INTO test VALUES('K20000',2)
{noformat}
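
A rough sketch of the "list only the new delta" idea, in case it helps the discussion. This is not Hive code: the class {{IncrementalStatsSketch}} and both of its methods are hypothetical, it uses plain java.nio instead of Hive's Warehouse/FileSystem classes, and it assumes the basic stats are stored in the table parameters under the keys "numFiles" and "totalSize". It only shows adding the new delta's file count and size to the existing values instead of re-listing the whole table directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration only; not part of Hive. */
final class IncrementalStatsSketch {
    // Assumed table-parameter keys for basic stats.
    static final String NUM_FILES = "numFiles";
    static final String TOTAL_SIZE = "totalSize";

    /** Lists the sizes of regular files directly under the newly written delta directory only. */
    static List<Long> listDeltaFileSizes(Path newDeltaDir) {
        try (Stream<Path> files = Files.list(newDeltaDir)) {
            return files.filter(Files::isRegularFile)
                        .map(IncrementalStatsSketch::sizeOf)
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Returns a copy of the table parameters with the new delta's files added
     * to the existing counters. Missing counters are treated as zero.
     */
    static Map<String, String> addDeltaStats(Map<String, String> params, List<Long> newFileSizes) {
        long numFiles = Long.parseLong(params.getOrDefault(NUM_FILES, "0"));
        long totalSize = Long.parseLong(params.getOrDefault(TOTAL_SIZE, "0"));
        for (long size : newFileSizes) {
            numFiles++;
            totalSize += size;
        }
        Map<String, String> updated = new HashMap<>(params);
        updated.put(NUM_FILES, Long.toString(numFiles));
        updated.put(TOTAL_SIZE, Long.toString(totalSize));
        return updated;
    }
}
```

The sketch only covers pure appends. Anything that removes or rewrites files (for example compaction or insert overwrite) would presumably still need a full recomputation, and the existing counters would have to be known to be accurate before adding to them.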