[
https://issues.apache.org/jira/browse/HIVE-25295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17421137#comment-17421137
]
Zhihua Deng edited comment on HIVE-25295 at 9/28/21, 3:14 AM:
--------------------------------------------------------------
Have you tried HIVE-17963? There is a catch for runaway processes adding
additional files into the staging directory.
was (Author: dengzh):
Have you tried [HIVE-17963](https://issues.apache.org/jira/browse/HIVE-17963),
there is a catch for runaway processes adding additional files into staging
directory.
> "File already exist exception" during mapper/reducer retry with old hive(0.13)
> ------------------------------------------------------------------------------
>
> Key: HIVE-25295
> URL: https://issues.apache.org/jira/browse/HIVE-25295
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 0.13.0
> Reporter: yuquan wang
> Priority: Blocker
>
> We are still using a very old Hive version (0.13) for historical reasons,
> and we often hit the following issue:
> {code:java}
> Caused by: java.io.IOException: File already exists:s3://smart-dmp/warehouse/uploaded/ad_dmp_pixel/dt=2021-06-21/key=259f3XXXXXXX
> {code}
> We have investigated this issue for quite a long time without finding a
> good fix, so I would like to ask the Hive community whether there are any
> solutions.
>
> The error occurs during the map/reduce stage: once an instance fails for
> some unexpected reason (for example, an unstable spot instance is killed),
> the later retry throws the above exception instead of overwriting the
> existing file.
>
> We have several guesses:
> 1. Is it caused by the ORC file format? I found a similar issue,
> https://issues.apache.org/jira/browse/HIVE-6341, but it has no comments,
> and our table is stored as ORC.
> 2. Is the problem solved in newer Hive versions? We also run Hive 2.3.6
> and have not met this issue there, so would a version upgrade solve it?
> 3. Is there a config that always cleans up existing folders when a
> mapper/reducer stage is retried? I have searched all the MapReduce configs
> but cannot find one.
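
The failure mode described in the issue can be illustrated with a small sketch. This is not Hive or Hadoop code: it uses java.nio on a local temp directory as a stand-in for the S3 output path, and the class and method names (RetryOutputSketch, writeNoOverwrite, writeAfterCleanup, plainRetryFails) are hypothetical. It only shows why a retry that creates its output without overwriting fails when a killed attempt left a file behind, and how removing the stale file first lets the retry succeed. S3 semantics and Hive's actual commit logic may differ.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class RetryOutputSketch {

    // Writes the output only if the file does not exist yet (no overwrite).
    // CREATE_NEW makes the call fail when a previous attempt left a file behind.
    public static void writeNoOverwrite(Path out, String data) throws IOException {
        Files.write(out, data.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }

    // Removes whatever a previous attempt left behind, then writes again.
    public static void writeAfterCleanup(Path out, String data) throws IOException {
        Files.deleteIfExists(out);
        writeNoOverwrite(out, data);
    }

    // Returns true if a plain retry (no cleanup) hits "file already exists".
    public static boolean plainRetryFails(Path out) throws IOException {
        try {
            writeNoOverwrite(out, "attempt-2");
            return false;
        } catch (FileAlreadyExistsException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical local stand-in for the task's output location.
        Path dir = Files.createTempDirectory("retry-sketch");
        Path out = dir.resolve("part-00000");

        // First attempt writes its file and is then assumed to be killed.
        writeNoOverwrite(out, "attempt-1 (partial)");

        // A retry that does not overwrite fails on the leftover file.
        System.out.println("plain retry fails: " + plainRetryFails(out));

        // A retry that cleans up first replaces the stale output.
        writeAfterCleanup(out, "attempt-2");
        System.out.println("content after cleanup retry: "
                + new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```

Whether a comparable cleanup or overwrite step exists for Hive 0.13 task retries is exactly the open question in the issue; the sketch only makes the two behaviours concrete.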
--
This message was sent by Atlassian Jira
(v8.3.4#803005)