[ 
https://issues.apache.org/jira/browse/HIVE-27964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799522#comment-17799522
 ] 

Naresh P R edited comment on HIVE-27964 at 12/21/23 5:47 PM:
-------------------------------------------------------------

Partition table rename gets clogged at PART_COL_STATS for wide tables.
{code:java}
CREATE TABLE IF NOT EXISTS `PART_COL_STATS` (
 ...
 `DB_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
 `TABLE_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
 `PARTITION_NAME` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin NOT 
NULL, 
  ...){code}
Since PART_COL_STATS holds db_name & table_name, incase of table rename, every 
row in PART_COL_STATS associated with the table should be fetched, stored in 
memory, delete & re-insert with new db/table/partition name.

Instead clearing the stats before rename & computing later would help to speed 
up the process.

 

Another optimization i was about to raise is to remove DB_NAME, TABLE_NAME, 
PARTITION_NAME from PART_COL_STATS & use PART_ID as FOREIGN KEY from PARTITIONS 
to avoid touching PART_COL_STATS for table/partition renames.


was (Author: nareshpr):
Partition table rename gets clogged at PART_COL_STATS for wide tables.
{code:java}
CREATE TABLE IF NOT EXISTS `PART_COL_STATS` (
 ...
 `DB_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
 `TABLE_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
 `PARTITION_NAME` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin NOT 
NULL, 
  ...){code}
Since PART_COL_STATS holds db_name & table_name, incase of table rename, every 
row in PART_COL_STATS associated with the table should be fetched, stored in 
memory, dropped & re-added with new tableName.

Instead clearing the stats before rename & computing later would help to speed 
up the process.

 

Another optimization i was about to raise is to remove DB_NAME, TABLE_NAME, 
PARTITION_NAME from PART_COL_STATS & use PART_ID as FOREIGN KEY from PARTITIONS 
to avoid touching PART_COL_STATS for table/partition renames.

> Support drop stats similar to Impala
> ------------------------------------
>
>                 Key: HIVE-27964
>                 URL: https://issues.apache.org/jira/browse/HIVE-27964
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Naresh P R
>            Priority: Major
>
> Hive should support drop stats similar to impala.
> https://impala.apache.org/docs/build/html/topics/impala_drop_stats.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to