[
https://issues.apache.org/jira/browse/HIVE-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302260#comment-15302260
]
Ashutosh Chauhan commented on HIVE-13850:
-----------------------------------------
Whatever name you choose, you will always be susceptible to a [TOCTTOU issue |
https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use], since the name is
chosen by a different process (Hive CLI) than the one doing the renames (NameNode).
Until HDFS adds a merge API (HDFS-9763), the best way to handle this scenario is to
turn on locking: https://cwiki.apache.org/confluence/display/Hive/Locking
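
To make the race concrete, here is a minimal sketch on the local filesystem using java.nio. It is not Hive or HDFS code: the class and method names (CopyNameRace, pickFreeName, claimFreeName) are made up for illustration, and the second method is only a contrast, not a proposed fix, because on HDFS the rename is performed by the NameNode and the client cannot claim the name atomically this way. pickFreeName has the same check-then-act shape as the loop in Hive.checkPaths(), so two writers that both check before either one writes end up with the same destination.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CopyNameRace {

    // Check-then-act: the returned name is only known to be free at the
    // instant of the exists() call, not when the file is later moved there.
    static Path pickFreeName(Path dir, String name, String ext) {
        Path dest = dir.resolve(name + ext);
        for (int counter = 1; Files.exists(dest); counter++) {
            dest = dir.resolve(name + "_copy_" + counter + ext);
        }
        return dest;
    }

    // Contrast (hypothetical, local filesystem only): claim the name with an
    // atomic check-and-create and retry with the next suffix on collision.
    static Path claimFreeName(Path dir, String name, String ext) throws IOException {
        Path dest = dir.resolve(name + ext);
        for (int counter = 1; ; counter++) {
            try {
                return Files.createFile(dest);
            } catch (FileAlreadyExistsException e) {
                dest = dir.resolve(name + "_copy_" + counter + ext);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("toctou");

        // Two writers check before either one has written anything.
        Path a = pickFreeName(dir, "000000_0", "");
        Path b = pickFreeName(dir, "000000_0", "");
        System.out.println("check-then-act picked the same name: " + a.equals(b));

        // With an atomic claim, the second writer is pushed to the next suffix.
        Path c = claimFreeName(dir, "000000_0", "");
        Path d = claimFreeName(dir, "000000_0", "");
        System.out.println("atomic claim picked the same name: " + c.equals(d));
    }
}
```

In the first case both writers are handed the same path, which is the state that produces the "destination exists" failure once the second rename runs. Locking avoids this by serializing the writers rather than by picking a better name.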
> File name conflict when have multiple INSERT INTO queries running in parallel
> -----------------------------------------------------------------------------
>
> Key: HIVE-13850
> URL: https://issues.apache.org/jira/browse/HIVE-13850
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.2.1
> Reporter: Bing Li
> Assignee: Bing Li
> Attachments: HIVE-13850-1.2.1.patch
>
>
> We have an application which connects to HiveServer2 via JDBC.
> The application executes "INSERT INTO" queries against the same table.
> If a lot of users run the application at the same time, some of
> the INSERTs can fail.
> The root cause is that Hive.checkPaths() uses the following loop to
> check for the existence of the file. If multiple inserts are running in
> parallel, this leads to a conflict.
> for (int counter = 1; fs.exists(itemDest) || destExists(result, itemDest); counter++) {
>   itemDest = new Path(destf, name + ("_copy_" + counter) + filetype);
> }
> The Error Message
> ===========================
> In hive log,
> org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!! Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-10000/000000_0 to hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/000000_0_copy_9014
>     at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2719)
>     at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645)
>
> In hadoop log,
> WARN hdfs.StateChange (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* FSDirectory.unprotectedRenameTo: failed to rename /apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-10000/000000_0 to /apps/hive/warehouse/metadata.db/scalding_stats/000000_0_copy_9014 because destination exists
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)