[
https://issues.apache.org/jira/browse/HIVE-21100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yu-Wen Lai updated HIVE-21100:
------------------------------
Resolution: Workaround
Status: Resolved (was: Patch Available)
While testing the original patch, I found the following workarounds:
1. set hive.exec.dynamic.partition.mode = nonstrict;
   This setting applies to partitioned tables. It adds one more reducer to the
   query plan, so the HIVE_UNION_SUBDIR directories are not generated.
2. set hive.merge.tezfiles=true;
   This setting applies to non-partitioned tables. The FileMerge task flattens
   the HIVE_UNION_SUBDIR directories.
Now that we have workarounds, we don't need to introduce additional file system
calls that have a negative performance impact on every insert query.
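
As a rough sketch of how the two workarounds above might be applied in a
session: the table and column names (events_partitioned, events_flat,
staging_a, staging_b, id, payload, dt) are hypothetical and not taken from this
issue. The hive.execution.engine line is an assumption, added only to make the
Tez context explicit. Whether the subdirectories actually disappear may depend
on the Hive and Tez versions in use.

```sql
-- Hypothetical tables and columns; only the two workaround properties
-- come from the comment above.
SET hive.execution.engine=tez;

-- Workaround 1: partitioned target table, dynamic-partition insert.
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE events_partitioned PARTITION (dt)
SELECT u.id, u.payload, u.dt
FROM (
  SELECT id, payload, dt FROM staging_a
  UNION ALL
  SELECT id, payload, dt FROM staging_b
) u;

-- Workaround 2: non-partitioned target table, merge step after the insert.
SET hive.merge.tezfiles=true;
INSERT INTO TABLE events_flat
SELECT u.id, u.payload
FROM (
  SELECT id, payload FROM staging_a
  UNION ALL
  SELECT id, payload FROM staging_b
) u;
```

Listing the table's directory afterwards (for example with dfs -ls from the
Hive CLI) should show whether any HIVE_UNION_SUBDIR directories remain.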
> Allow flattening of table subdirectories resulted when using TEZ engine and
> UNION clause
> ----------------------------------------------------------------------------------------
>
> Key: HIVE-21100
> URL: https://issues.apache.org/jira/browse/HIVE-21100
> Project: Hive
> Issue Type: Improvement
> Reporter: George Pachitariu
> Assignee: George Pachitariu
> Priority: Minor
> Labels: pull-request-available
> Attachments: HIVE-21100.1.patch, HIVE-21100.2.patch,
> HIVE-21100.3.patch, HIVE-21100.patch
>
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> Right now, when writing data into a table with the Tez engine and a UNION ALL
> clause is the last step of the query, Hive on Tez creates a subdirectory for
> each branch of the UNION ALL.
> With this patch, the subdirectories are removed, and the files are renamed
> and moved to the parent directory.