[ https://issues.apache.org/jira/browse/HIVE-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16159338#comment-16159338 ]
Sergey Shelukhin commented on HIVE-17403:
-----------------------------------------
+1
> Fail concatenation for unmanaged and transactional tables
> ---------------------------------------------------------
>
> Key: HIVE-17403
> URL: https://issues.apache.org/jira/browse/HIVE-17403
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.3.0, 3.0.0, 2.4.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Blocker
> Attachments: HIVE-17403.1.patch, HIVE-17403.2.patch
>
>
> ALTER TABLE .. CONCATENATE should fail if the table is not managed by Hive.
> For unmanaged tables, file names can be anything. Hive makes assumptions
> about file names that can result in data loss for unmanaged tables.
> An example is a table/partition containing two different files
> (part-m-00000__1417075294718 and part-m-00018__1417075294718). Although they
> are completely different files, Hive treats them as files generated by
> separate instances of the same task (because of failure or speculative
> execution) and ends up removing one of them:
> {code}
> 2017-08-28T18:19:29,516 WARN [b27f10d5-d957-4695-ab2a-1453401793df main]:
> exec.Utilities (:()) - Duplicate taskid file removed:
> file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00018__1417075294718
> with length 958510. Existing file:
> file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00000__1417075294718
> with length 1123116
> {code}
> DDL should restrict concatenation for unmanaged tables.
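
To illustrate the collision described above, here is a minimal Java sketch. It is not taken from the Hive source: the class name, the helper names, and the regular expression are assumptions chosen only to show how a "trailing digits are the task id" rule can map both file names from the example to the same key. The sketch keeps the first file seen for each key, which is a simplification of whatever rule Hive actually applies when it picks a survivor.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TaskIdCollisionSketch {
    // Hypothetical pattern (an assumption, not Hive's actual code):
    // a run of digits, an optional "_<attempt>" suffix, an optional extension.
    private static final Pattern TASK_ID_PATTERN =
        Pattern.compile("^.*?([0-9]+)(_[0-9]{1,6})?(\\..*)?$");

    // Returns the digits treated as the task id, or the whole name if nothing matches.
    static String taskIdOf(String fileName) {
        Matcher m = TASK_ID_PATTERN.matcher(fileName);
        return m.matches() ? m.group(1) : fileName;
    }

    // Keeps one file per task id (first one seen); later files with the
    // same id are considered duplicates and dropped.
    static Map<String, String> dedupe(String... fileNames) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (String name : fileNames) {
            kept.putIfAbsent(taskIdOf(name), name);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Names written by Hive-style tasks: distinct ids, both files survive.
        System.out.println(dedupe("000000_0", "000001_0"));

        // Names from the issue: both reduce to the shared numeric suffix,
        // so the second file is treated as a duplicate of the first.
        System.out.println(dedupe(
            "part-m-00000__1417075294718",
            "part-m-00018__1417075294718"));
    }
}
```

Under this assumed rule, the two externally written files are indistinguishable because only the shared suffix is read as the task id, which is why rejecting CONCATENATE for unmanaged tables avoids the data loss.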