Wechar created HIVE-27725:
-----------------------------

             Summary: Remove redundant columns in TAB_COL_STATS and 
PART_COL_STATS
                 Key: HIVE-27725
                 URL: https://issues.apache.org/jira/browse/HIVE-27725
             Project: Hive
          Issue Type: Improvement
          Components: Hive
    Affects Versions: 4.0.0-beta-1
            Reporter: Wechar
            Assignee: Wechar


The {{TAB_COL_STATS}} table includes {{CAT_NAME}}, {{DB_NAME}} and {{TABLE_NAME}}, 
which can be fetched by joining the {{TBLS}} and {{DBS}} tables on the {{TBL_ID}} 
and {{DB_ID}} columns.

The {{PART_COL_STATS}} table includes {{CAT_NAME}}, {{DB_NAME}}, {{TABLE_NAME}} 
and {{PARTITION_NAME}}, which can be fetched by joining the {{PARTITIONS}}, 
{{TBLS}} and {{DBS}} tables on {{PART_ID}}, {{TBL_ID}} and {{DB_ID}}.
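
As a rough illustration of the joins described above, the following sketch builds a heavily simplified copy of the relevant metastore tables in an in-memory SQLite database and recovers the names without storing them in the stats tables. The reduced column set, the use of SQLite, and the sample rows are assumptions made for illustration only; the real HMS schema has many more columns and runs on other backend databases.

```python
import sqlite3

# Simplified stand-in for the metastore schema (assumption: only the
# columns needed for the joins are modelled here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT, CTLG_NAME TEXT);
CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, DB_ID INTEGER, TBL_NAME TEXT);
CREATE TABLE PARTITIONS (PART_ID INTEGER PRIMARY KEY, TBL_ID INTEGER, PART_NAME TEXT);
-- Stats tables WITHOUT the redundant name columns.
CREATE TABLE TAB_COL_STATS (CS_ID INTEGER PRIMARY KEY, TBL_ID INTEGER, COLUMN_NAME TEXT);
CREATE TABLE PART_COL_STATS (CS_ID INTEGER PRIMARY KEY, PART_ID INTEGER, COLUMN_NAME TEXT);

-- Hypothetical sample data.
INSERT INTO DBS VALUES (1, 'sales', 'hive');
INSERT INTO TBLS VALUES (10, 1, 'orders');
INSERT INTO PARTITIONS VALUES (100, 10, 'dt=2023-09-01');
INSERT INTO TAB_COL_STATS VALUES (1000, 10, 'amount');
INSERT INTO PART_COL_STATS VALUES (2000, 100, 'amount');
""")

# Catalog, database and table names come from TBLS and DBS.
table_stats = conn.execute("""
    SELECT d.CTLG_NAME, d.NAME, t.TBL_NAME, s.COLUMN_NAME
    FROM TAB_COL_STATS s
    JOIN TBLS t ON s.TBL_ID = t.TBL_ID
    JOIN DBS d ON t.DB_ID = d.DB_ID
""").fetchall()

# The partition name additionally comes from PARTITIONS.
part_stats = conn.execute("""
    SELECT d.CTLG_NAME, d.NAME, t.TBL_NAME, p.PART_NAME, s.COLUMN_NAME
    FROM PART_COL_STATS s
    JOIN PARTITIONS p ON s.PART_ID = p.PART_ID
    JOIN TBLS t ON p.TBL_ID = t.TBL_ID
    JOIN DBS d ON t.DB_ID = d.DB_ID
""").fetchall()

print(table_stats)
print(part_stats)
```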

In addition, HMS currently gets table statistics without joining any other 
table, but deletes table statistics with a join on {{TBLS}}. This inconsistency 
causes an exception in a corner case: if some table column statistics are left 
behind when a table is dropped, and the user then recreates the table with the 
same table name and database name, the new table gets a different {{TBL_ID}}. 
In this case the user incorrectly gets the old table's column statistics. If the 
user then tries to delete the stats fetched by the get API, a 
{{NoSuchObjectException}} is thrown.
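
The corner case above can be simulated with a small sketch. It is not the actual HMS code path: the two queries below only stand in for a name-based "get" and a join-based "delete" lookup, and the simplified schema, SQLite backend, and sample IDs are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT, CTLG_NAME TEXT);
CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, DB_ID INTEGER, TBL_NAME TEXT);
-- Stats table WITH the redundant name columns.
CREATE TABLE TAB_COL_STATS (
    CS_ID INTEGER PRIMARY KEY, CAT_NAME TEXT, DB_NAME TEXT, TABLE_NAME TEXT,
    COLUMN_NAME TEXT, TBL_ID INTEGER);

INSERT INTO DBS VALUES (1, 'sales', 'hive');
-- Stale stats row: its table (TBL_ID 10) was dropped, but the row remained.
INSERT INTO TAB_COL_STATS VALUES (1000, 'hive', 'sales', 'orders', 'amount', 10);
-- The table is recreated with the same names and receives a new TBL_ID.
INSERT INTO TBLS VALUES (11, 1, 'orders');
""")

names = ("hive", "sales", "orders")

# "Get" style lookup: filters on the name columns only, no join.
found_by_name = conn.execute("""
    SELECT CS_ID FROM TAB_COL_STATS
    WHERE CAT_NAME = ? AND DB_NAME = ? AND TABLE_NAME = ?
""", names).fetchall()

# "Delete" style lookup: resolves the table through TBLS and DBS.
found_by_join = conn.execute("""
    SELECT s.CS_ID
    FROM TAB_COL_STATS s
    JOIN TBLS t ON s.TBL_ID = t.TBL_ID
    JOIN DBS d ON t.DB_ID = d.DB_ID
    WHERE d.CTLG_NAME = ? AND d.NAME = ? AND t.TBL_NAME = ?
""", names).fetchall()

print("visible to get:", found_by_name)
print("visible to delete:", found_by_join)
```

In this simulation the name-based lookup returns the stale row while the join-based lookup finds nothing, which mirrors the mismatch that would lead to the exception on delete.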



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
