[
https://issues.apache.org/jira/browse/HIVE-26110?focusedWorklogId=752340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-752340
]
ASF GitHub Bot logged work on HIVE-26110:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 04/Apr/22 16:04
Start Date: 04/Apr/22 16:04
Worklog Time Spent: 10m
Work Description: szlta opened a new pull request, #3174:
URL: https://github.com/apache/hive/pull/3174
A bulk insert into a partitioned Iceberg table creates many files because
SortedDynPartitionOptimizer does not set the key->reducer affinity. That
affinity could be established simply by marking the sort expressions as
'partition' columns.
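
As a rough illustration of why key->reducer affinity affects the file count,
here is a toy model in Python. It is not Hive code: the functions
count_files, round_robin and by_partition_key are hypothetical names made up
for this sketch, and it assumes each reducer writes one file per partition
value it receives.

```python
import zlib


def count_files(rows, num_reducers, route):
    """Count output files, assuming one file per (partition, reducer) pair."""
    files = set()
    for i, (partition_key, _payload) in enumerate(rows):
        reducer = route(i, partition_key, num_reducers)
        files.add((partition_key, reducer))
    return len(files)


def round_robin(i, partition_key, num_reducers):
    # No affinity: rows are spread over reducers regardless of their key.
    return i % num_reducers


def by_partition_key(i, partition_key, num_reducers):
    # Affinity: every row with the same key goes to the same reducer.
    return zlib.crc32(str(partition_key).encode()) % num_reducers


# 1000 rows spread over 5 partition values, written by 8 reducers.
rows = [(i % 5, f"row-{i}") for i in range(1000)]

print(count_files(rows, 8, round_robin))       # 40
print(count_files(rows, 8, by_partition_key))  # 5
```

Without affinity every reducer ends up holding rows for every partition, so
the file count grows with partitions times reducers. With affinity it is
bounded by the number of distinct partition values.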
Issue Time Tracking
-------------------
Worklog Id: (was: 752340)
Remaining Estimate: 0h
Time Spent: 10m
> bulk insert into partitioned table creates lots of files in iceberg
> -------------------------------------------------------------------
>
> Key: HIVE-26110
> URL: https://issues.apache.org/jira/browse/HIVE-26110
> Project: Hive
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> For example, create the web_returns table from TPC-DS in Iceberg format and
> try to copy data over from the regular table, with something like "insert
> into web_returns_iceberg as select * from web_returns".
> This inserts the data correctly, but it leaves a large number of files in
> each partition. IMO, the dynamic sort optimisation isn't working properly,
> so records are not grouped in the final phase.