[ 
https://issues.apache.org/jira/browse/HIVE-21100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu-Wen Lai updated HIVE-21100:
------------------------------
    Resolution: Workaround
        Status: Resolved  (was: Patch Available)

While testing the original patch, I found the following workarounds.
 # set hive.exec.dynamic.partition.mode = nonstrict;
This setting applies to partitioned tables. It makes the query plan use one 
more reducer, and that plan does not generate HIVE_UNION_SUBDIR directories.
 # set hive.merge.tezfiles=true;
This setting applies to non-partitioned tables. The FileMerge task flattens the 
HIVE_UNION_SUBDIR directories.
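
As a rough sketch, the two settings above might be applied as follows. The 
table and column names (sales_a, sales_b, sales_partitioned, sales_flat, id, 
amount, dt) are hypothetical and only illustrate the shape of the query; the 
two workaround settings themselves are taken from this comment.

```sql
-- Hypothetical example: table and column names are assumptions for illustration.
SET hive.execution.engine=tez;

-- Workaround 1: partitioned target table, written with dynamic partitioning.
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE sales_partitioned PARTITION (dt)
SELECT id, amount, dt FROM sales_a
UNION ALL
SELECT id, amount, dt FROM sales_b;

-- Workaround 2: non-partitioned target table, with Tez file merging enabled.
SET hive.merge.tezfiles=true;
INSERT OVERWRITE TABLE sales_flat
SELECT id, amount FROM sales_a
UNION ALL
SELECT id, amount FROM sales_b;
```

Listing the target table's directory after each insert should show whether 
any HIVE_UNION_SUBDIR directories remain.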

Now that we have workarounds, we don't need to introduce additional file system 
calls that have a negative performance impact on every insert query.

> Allow flattening of table subdirectories resulted when using TEZ engine and 
> UNION clause
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-21100
>                 URL: https://issues.apache.org/jira/browse/HIVE-21100
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: George Pachitariu
>            Assignee: George Pachitariu
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HIVE-21100.1.patch, HIVE-21100.2.patch, 
> HIVE-21100.3.patch, HIVE-21100.patch
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Right now, when writing data into a table with the Tez engine and a 
> UNION ALL clause is the last step of the query, Hive on Tez creates a 
> subdirectory for each branch of the UNION ALL.
> With this patch, the subdirectories are removed, and the files are renamed and 
> moved to the parent directory.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
