[ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593182#comment-15593182
 ] 

Gopal V commented on HIVE-14535:
--------------------------------

bq.  Was Hive modified to force each task attempt to write to the same file?

No, the file name choice was the product of Hive bucketing. Because of the 
write-once, rename-twice flow (_tmp -> task dir, task dir -> table dir), this 
was not a problem until someone tried to write directly.
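
A minimal sketch of the write-once, rename-twice flow, to show why two attempts sharing a file name do not collide before the commit. It is an illustration on a local filesystem, not Hive's code: the directory names `_tmp`, `_task` and `table`, the file name `bucket_00000`, and the functions `write_attempt`, `commit` and `demo_rename` are all hypothetical.

```python
import os
import tempfile

# Hypothetical bucket-derived name: every attempt of the task uses the same one.
BUCKET_FILE = "bucket_00000"


def write_attempt(root, attempt_id, payload):
    """Write once, into a directory private to this attempt."""
    attempt_dir = os.path.join(root, "_tmp", attempt_id)
    os.makedirs(attempt_dir)
    path = os.path.join(attempt_dir, BUCKET_FILE)
    with open(path, "wb") as out:
        out.write(payload)
    return path


def commit(root, winning_path):
    """Rename twice: _tmp -> task dir, then task dir -> table dir."""
    task_dir = os.path.join(root, "_task")
    table_dir = os.path.join(root, "table")
    os.makedirs(task_dir, exist_ok=True)
    os.makedirs(table_dir, exist_ok=True)

    staged = os.path.join(task_dir, BUCKET_FILE)
    os.rename(winning_path, staged)   # first rename
    final = os.path.join(table_dir, BUCKET_FILE)
    os.rename(staged, final)          # second rename
    return final


def demo_rename():
    root = tempfile.mkdtemp()
    # Same file name, but different paths, so the attempts never overwrite each other.
    first = write_attempt(root, "attempt_0", b"rows from attempt 0")
    second = write_attempt(root, "attempt_1", b"rows from attempt 1")
    final = commit(root, first)
    with open(final, "rb") as committed:
        return first != second, committed.read()
```

If both attempts instead opened `table/bucket_00000` directly, they would be writing to the same path, which is the situation described above.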

bq.  In that case what was the exact issue with checksum-safety?

The writers can't "win" until they have consumed the last byte of their 
shuffle, which is the point where one of them finds out that it read corrupted 
data (because the checksum does not match).
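
A rough sketch of why the verdict arrives only at the last byte. It assumes a CRC32 over the whole stream, compared at the end; the real shuffle checksum format is not specified here, and `copy_with_trailing_checksum` and `demo_checksum` are hypothetical names.

```python
import io
import zlib


def copy_with_trailing_checksum(shuffle, sink, expected_crc, chunk_size=4):
    """Stream bytes to the sink; the CRC result is known only after the last byte."""
    crc = 0
    while True:
        chunk = shuffle.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)              # bytes reach the output before any verdict
        crc = zlib.crc32(chunk, crc)   # running checksum over everything read so far
    return crc == expected_crc


def demo_checksum():
    good = b"bucket rows"
    expected = zlib.crc32(good)
    corrupted = b"bucket r0ws"         # one byte differs from the good stream

    sink = io.BytesIO()
    ok = copy_with_trailing_checksum(io.BytesIO(corrupted), sink, expected)
    return ok, sink.getvalue() == corrupted
```

In this sketch the corrupted writer has already emitted all of its bytes by the time the mismatch is detected, so a sink that is the final file would hold bad data at that point.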

> add micromanaged tables to Hive (metastore keeps track of the files)
> --------------------------------------------------------------------
>
>                 Key: HIVE-14535
>                 URL: https://issues.apache.org/jira/browse/HIVE-14535
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)