[ https://issues.apache.org/jira/browse/HIVE-27494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhihua Deng resolved HIVE-27494.
--------------------------------
Fix Version/s: 4.0.0-beta-1
Resolution: Fixed
Fix has been merged. Thanks [~dkuzmenko], [~kkasa] and [~zabetak] for the
review!
> Deduplicate the task result that generated by more branches in union all
> ------------------------------------------------------------------------
>
> Key: HIVE-27494
> URL: https://issues.apache.org/jira/browse/HIVE-27494
> Project: Hive
> Issue Type: Bug
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
> Attachments: ddl.q, explain.output
>
>
> HIVE-23891 added the ability to deduplicate the task results under the
> directory
> <table-dir>/<staging-dir>/_tmp.-ext-10000/<dynamic-partition-dir>/HIVE_UNION_SUBDIR_1,
> but it does not take the same action on the other directory created by the
> same query:
> <table-dir>/<staging-dir>/_tmp.-ext-10000/<dynamic-partition-dir>/HIVE_UNION_SUBDIR_2.
> As a result, users may still see duplicated data when a Tez task runs
> multiple attempts.
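
To illustrate the idea described above, here is a minimal Python sketch, not Hive's actual implementation. It assumes task output files are named <taskId>_<attemptId> (for example 000000_1) and, as a simplification made only for this sketch, keeps the file from the highest attempt for each task. The function name dedupe_union_subdirs and the in-memory listing are hypothetical. The point is that the same rule is applied to every HIVE_UNION_SUBDIR_* directory rather than only the first one.

```python
import re

# Assumed file-name shape for this sketch: <taskId>_<attemptId>, e.g. "000000_1".
TASK_FILE = re.compile(r"^(\d+)_(\d+)$")


def dedupe_union_subdirs(files_by_subdir):
    """Return, per union subdirectory, the file names that survive deduplication.

    files_by_subdir maps a directory name such as "HIVE_UNION_SUBDIR_1" to the
    file names found in it. For each task id only the file written by the
    highest attempt is kept (a simplification chosen for this sketch).
    Names that do not match the assumed pattern are kept untouched.
    """
    kept = {}
    for subdir, names in files_by_subdir.items():
        best = {}    # task id -> (attempt number, file name)
        others = []  # names that do not look like task output files
        for name in names:
            match = TASK_FILE.match(name)
            if match is None:
                others.append(name)
                continue
            task, attempt = match.group(1), int(match.group(2))
            if task not in best or attempt > best[task][0]:
                best[task] = (attempt, name)
        kept[subdir] = sorted(n for _, n in best.values()) + sorted(others)
    return kept


# Hypothetical listing: both union branches contain output from two attempts
# of task 000000.
listing = {
    "HIVE_UNION_SUBDIR_1": ["000000_0", "000000_1"],
    "HIVE_UNION_SUBDIR_2": ["000000_0", "000000_1", "000001_0"],
}
result = dedupe_union_subdirs(listing)
```

Processing only HIVE_UNION_SUBDIR_1 would leave both attempts of task 000000 in HIVE_UNION_SUBDIR_2, which is the duplication this issue describes.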