[ https://issues.apache.org/jira/browse/HIVE-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916249#action_12916249 ]

Ning Zhang commented on HIVE-1665:
----------------------------------

What if 2) failed and rolling back 1) also failed? This could happen if the 
CLI got killed at any time between 1) and 2). 

Another option is to use the traditional 'mark-then-delete' trick: mark the 
partition as deleted in the metastore first, and then clean up the data. In 
case of any failure, redoing the drop partition will resume the data-deletion 
process. It is also easier from the administrator's point of view: you can 
periodically check the metastore for deleted partitions that were left 
uncommitted and re-drop them. A sketch of this idea follows below. 
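For illustration, a minimal sketch of the mark-then-delete idea. The 
MetaStoreClient and Warehouse interfaces and every method on them are 
hypothetical stand-ins, not the actual Hive metastore API:

{code:java}
import java.io.IOException;
import java.util.List;

// Hypothetical sketch of mark-then-delete; these interfaces are
// illustrative stand-ins, not the real Hive metastore client.
public class MarkThenDeleteSketch {

  interface MetaStoreClient {
    void markPartitionDropped(String table, String part); // durable flag
    void finalizeDrop(String table, String part);         // remove entry
    List<String[]> listUncommittedDrops();                // {table, part}
  }

  interface Warehouse {
    void deleteRecursively(String table, String part) throws IOException;
  }

  private final MetaStoreClient metastore;
  private final Warehouse warehouse;

  MarkThenDeleteSketch(MetaStoreClient m, Warehouse w) {
    this.metastore = m;
    this.warehouse = w;
  }

  // Mark first, then delete: a crash at any point leaves the partition
  // flagged, so simply re-running the drop resumes the data deletion.
  public void dropPartition(String table, String part) throws IOException {
    metastore.markPartitionDropped(table, part); // 1) mark as deleted
    warehouse.deleteRecursively(table, part);    // 2) clean up the data
    metastore.finalizeDrop(table, part);         // 3) commit the drop
  }

  // Periodic administrative sweep: re-drop partitions left uncommitted.
  public void sweepUncommittedDrops() throws IOException {
    for (String[] tp : metastore.listUncommittedDrops()) {
      dropPartition(tp[0], tp[1]);
    }
  }
}
{code}

Because the mark is written durably before any data is touched, the drop is 
idempotent: re-running it (or the periodic sweep) after a crash simply 
continues the deletion. 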

> drop operations may cause file leak
> -----------------------------------
>
>                 Key: HIVE-1665
>                 URL: https://issues.apache.org/jira/browse/HIVE-1665
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-1665.1.patch
>
>
> Right now when doing a drop, Hive first drops the metadata and then drops the 
> actual files. If the file system is down at that time, the files will never 
> be deleted. 
> We had an offline discussion about this:
> to fix this, add a new conf "scratch dir" to the Hive conf. 
> When doing a drop operation (sketched in code below):
> 1) move the data to the scratch directory
> 2) drop the metadata
> 3.1) if 2) failed, roll back 1) and report the error
> 3.2) if 2) succeeded, drop the data from the scratch directory
> 4) if 3.2) fails, we are OK because we assume the scratch dir will be emptied 
> manually.
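
For comparison, a rough sketch of the scratch-dir protocol quoted above; all 
names here are illustrative and are not taken from hive-1665.1.patch:

{code:java}
import java.io.IOException;

// Illustrative sketch of the scratch-dir drop protocol from the issue
// description; the interfaces are hypothetical, not the patch's code.
public class ScratchDirDropSketch {

  interface MetaStore {
    void dropPartitionMetadata(String table, String part) throws IOException;
  }

  interface FileOps {
    void moveToScratch(String table, String part) throws IOException;
    void moveBackFromScratch(String table, String part) throws IOException;
    void deleteFromScratch(String table, String part) throws IOException;
  }

  private final MetaStore metastore;
  private final FileOps fs;

  ScratchDirDropSketch(MetaStore m, FileOps f) {
    this.metastore = m;
    this.fs = f;
  }

  public void dropPartition(String table, String part) throws IOException {
    fs.moveToScratch(table, part);                  // 1) move data aside
    try {
      metastore.dropPartitionMetadata(table, part); // 2) drop metadata
    } catch (IOException e) {
      fs.moveBackFromScratch(table, part);          // 3.1) roll back 1)
      throw e;                                      //      and report error
    }
    try {
      fs.deleteFromScratch(table, part);            // 3.2) drop the data
    } catch (IOException e) {
      // 4) OK to ignore: the scratch dir is assumed to be emptied manually.
    }
  }
}
{code}

Note that a kill between steps 1) and 2) leaves the data parked in the scratch 
dir while the metadata still exists, which is exactly the failure mode raised 
in the comment above. 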

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
