[
https://issues.apache.org/jira/browse/HIVE-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
László Végh updated HIVE-26674:
-------------------------------
Description: A new compaction type is required for implicitly bucketed
tables. These tables can become unbalanced over time: the first few buckets
hold the majority of the data, while buckets with higher indexes hold
progressively less. As a result, query performance on these tables degrades
over time. To solve this issue, the data needs to be re-balanced among the
buckets periodically. The plan is to do this via a new RE-BALANCING
compaction. This compaction can be issued either manually by users or
automatically by the Initiator. Automatic re-balancing must be triggered by
evaluating a set of thresholds.
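
The description does not say which thresholds the Initiator would evaluate. As a rough illustration only, the sketch below assumes a single hypothetical threshold: the ratio of the fullest bucket's row count to the mean row count per bucket. The function names, the `max_skew` parameter, and its default are invented for this example and are not part of Hive.

```python
from typing import List, Sequence


def needs_rebalance(bucket_rows: Sequence[int], max_skew: float = 2.0) -> bool:
    """Return True when the fullest bucket holds more than max_skew times
    the mean number of rows per bucket (hypothetical threshold)."""
    total = sum(bucket_rows)
    if not bucket_rows or total == 0:
        return False  # nothing to balance
    mean = total / len(bucket_rows)
    return max(bucket_rows) / mean > max_skew


def balanced_sizes(bucket_rows: Sequence[int]) -> List[int]:
    """Return the row counts an even redistribution would produce,
    keeping the same number of buckets."""
    n = len(bucket_rows)
    if n == 0:
        return []
    base, extra = divmod(sum(bucket_rows), n)
    # The first `extra` buckets take one additional row each.
    return [base + 1 if i < extra else base for i in range(n)]


# A table whose first bucket holds most of the data.
skewed = [900, 80, 15, 5]
if needs_rebalance(skewed):
    print(balanced_sizes(skewed))
```

A real trigger would likely combine several such measures, which is what "a set of thresholds" suggests. This sketch only shows the shape of one check.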
> REBALANCE type compaction
> -------------------------
>
> Key: HIVE-26674
> URL: https://issues.apache.org/jira/browse/HIVE-26674
> Project: Hive
> Issue Type: Improvement
> Reporter: László Végh
> Assignee: László Végh
> Priority: Major
>
> A new compaction type is required for implicitly bucketed tables. These
> tables can become unbalanced over time: the first few buckets hold the
> majority of the data, while buckets with higher indexes hold progressively
> less. As a result, query performance on these tables degrades over time. To
> solve this issue, the data needs to be re-balanced among the buckets
> periodically. The plan is to do this via a new RE-BALANCING compaction. This
> compaction can be issued either manually by users or automatically by the
> Initiator. Automatic re-balancing must be triggered by evaluating a set of
> thresholds.