[
https://issues.apache.org/jira/browse/KYLIN-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928301#comment-17928301
]
Guoliang Sun commented on KYLIN-6025:
-------------------------------------
h3. Steps to Reproduce the Issue
# Create a view.
##
{code:sql}
CREATE VIEW kylin_sales_partition_view AS
SELECT trans_id % 2 AS partition_col, *
FROM kylin_sales
{code}
# Load the view as an internal table, set `partition_col` as the partition
column, and load the data.
# After the load, the partition where `partition_col=1` contains two data files.
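
One way to check the file count from the last step is sketched below. It is only an illustration, not Kylin code: it assumes the internal table's files are reachable as a directory tree (for example through a local or mounted copy) and that partitions use Hive-style `partition_col=<value>` directory names. The helper names, the `task_*` subdirectories, and the file names are hypothetical.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_files_per_partition(table_root, partition_key="partition_col"):
    """Count data files under each `<partition_key>=<value>` directory,
    including files that sit in nested subdirectories."""
    root = Path(table_root)
    counts = Counter()
    for path in root.rglob("*"):
        # Skip directories and marker/hidden files such as _SUCCESS or .crc.
        if not path.is_file() or path.name.startswith(("_", ".")):
            continue
        # Attribute the file to the first partition directory on its path.
        for part in path.relative_to(root).parts[:-1]:
            if part.startswith(partition_key + "="):
                counts[part] += 1
                break
    return dict(counts)


def build_demo_table(table_root):
    """Create a hypothetical layout in which two write tasks each left
    their own subdirectory inside partition_col=1 (assumed, not Kylin's
    actual on-disk format)."""
    root = Path(table_root)
    files = [
        "partition_col=0/task_0/part-00000.parquet",
        "partition_col=1/task_0/part-00000.parquet",
        "partition_col=1/task_1/part-00000.parquet",
        "partition_col=1/_SUCCESS",
    ]
    for rel in files:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_table(tmp)
        for partition, n in sorted(count_files_per_partition(tmp).items()):
            print(partition, n)
```

A partition that reports more than one file after a single load is a candidate for the merge described in this issue.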
> Support file merging within partitions for internal tables
> ----------------------------------------------------------
>
> Key: KYLIN-6025
> URL: https://issues.apache.org/jira/browse/KYLIN-6025
> Project: Kylin
> Issue Type: New Feature
> Affects Versions: 5.0.0
> Reporter: Guoliang Sun
> Priority: Major
>
> When multiple tasks write to the same internal table partition during the
> build phase, the data is written into multiple subdirectories, which can
> easily lead to an excessive number of files and increase HDFS pressure. A
> reasonable merging mechanism is needed.