[
https://issues.apache.org/jira/browse/HIVE-25293?focusedWorklogId=643248&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-643248
]
ASF GitHub Bot logged work on HIVE-25293:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 30/Aug/21 00:09
Start Date: 30/Aug/21 00:09
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #2434:
URL: https://github.com/apache/hive/pull/2434#issuecomment-907906173
This pull request has been automatically marked as stale because it has not
had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the [email protected] list if the patch is in
need of reviews.
Issue Time Tracking
-------------------
Worklog Id: (was: 643248)
Time Spent: 1h (was: 50m)
> Alter partitioned table with "cascade" option create too many columns records.
> ------------------------------------------------------------------------------
>
> Key: HIVE-25293
> URL: https://issues.apache.org/jira/browse/HIVE-25293
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: 2.3.3, 3.1.2
> Reporter: yongtaoliao
> Assignee: yongtaoliao
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> When a partitioned table is altered with the "cascade" option, all of its
> partitions are supposed to be updated. Currently, a new CD_ID is created for
> each partition, each associated with its own set of columns, which causes a
> large amount of redundant data in the metadata database.
> The following DDL statements can reproduce this scenario:
>
> {code:java}
> create table test_table (f1 int) partitioned by (p string);
> alter table test_table add partition(p='a');
> alter table test_table add partition(p='b');
> alter table test_table add partition(p='c');
> alter table test_table add columns (f2 int) cascade;{code}
> All partitions use the table's `CD_ID` before the columns are added, while
> each partition uses its own `CD_ID` afterwards.
>
> My proposal is that all partitions should share the same `CD_ID` when the
> table is altered with the "cascade" option.
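
One way to observe the behaviour described above is to query the metastore's backing database directly, before and after the cascading ALTER. The following is only a sketch: it assumes the standard metastore tables TBLS, SDS and PARTITIONS on a MySQL-style backend (PostgreSQL-backed metastores typically need the identifiers double-quoted), and it filters on table name alone, which is ambiguous if several databases contain a table called test_table.

```sql
-- Sketch only: run against the metastore's backing RDBMS, not through Hive.
-- Assumes the standard metastore schema (TBLS, SDS, PARTITIONS).

-- How many distinct column descriptors do the partitions of the table reference?
SELECT t.TBL_NAME,
       COUNT(p.PART_ID)         AS partition_count,
       COUNT(DISTINCT ps.CD_ID) AS distinct_partition_cd_ids
FROM TBLS t
JOIN PARTITIONS p ON p.TBL_ID = t.TBL_ID
JOIN SDS ps       ON ps.SD_ID = p.SD_ID
WHERE t.TBL_NAME = 'test_table'   -- add a DB_ID filter if the name is not unique
GROUP BY t.TBL_NAME;

-- Compare each partition's CD_ID with the table-level CD_ID.
SELECT p.PART_NAME,
       ps.CD_ID AS partition_cd_id,
       ts.CD_ID AS table_cd_id
FROM TBLS t
JOIN SDS ts       ON ts.SD_ID = t.SD_ID
JOIN PARTITIONS p ON p.TBL_ID = t.TBL_ID
JOIN SDS ps       ON ps.SD_ID = p.SD_ID
WHERE t.TBL_NAME = 'test_table'
ORDER BY p.PART_NAME;
```

If the report is accurate, the first query should show a single shared CD_ID before the ALTER and one CD_ID per partition after it.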