[ https://issues.apache.org/jira/browse/HDFS-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800915#action_12800915 ]

dhruba borthakur commented on HDFS-822:
---------------------------------------

If there is space on the same volume, then the rename occurs within that 
volume. In this case, holding the FSDataset lock across the rename is ok.

If there is no space, then the rename is actually a data copy. In this case, we 
should not hold the FSDataset lock across the rename. Moreover, a rename across 
two mount points is not atomic, i.e. if the datanode dies in the middle of the 
rename, we can end up with a partial file at the destination as well as the 
complete file at the source, isn't it?
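
To illustrate the two cases, here is a minimal sketch, not the actual FSDataset code and with made-up names, of how a block-file "rename" degenerates into a byte-by-byte copy when the source and destination sit on different mount points. The copy branch is exactly where holding the lock is expensive and where a crash leaves a partial destination file next to the intact source.

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    // Hypothetical sketch, not the real datanode code.
    public class CrossVolumeMove {
      static void moveBlockFile(File src, File dst) throws IOException {
        // Same volume: a cheap, atomic metadata operation.
        if (src.renameTo(dst)) {
          return;
        }
        // Across mount points renameTo fails, so the bytes must be copied.
        // If the process dies inside this loop, dst is a partial file
        // while src is still complete: the non-atomic case described above.
        try (FileInputStream in = new FileInputStream(src);
             FileOutputStream out = new FileOutputStream(dst)) {
          byte[] buf = new byte[64 * 1024];
          int n;
          while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
          }
        }
        if (!src.delete()) {
          throw new IOException("Failed to delete " + src + " after copy");
        }
      }
    }
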

> Appends to already-finalized blocks can rename across volumes
> -------------------------------------------------------------
>
>                 Key: HDFS-822
>                 URL: https://issues.apache.org/jira/browse/HDFS-822
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: HDFS-822.patch
>
>
> This is a performance issue. As I understand the code in FSDataset.append, if 
> the block is already finalized, it needs to be moved into the RBW directory so 
> it can go back into a "being written" state. This is done using 
> volumes.getNextVolume without preference for the volume the block currently 
> lives on. It seems to me that this could cause a lot of slow cross-volume 
> copies for applications that periodically append/close/append/close a file. 
> Instead, getNextVolume could provide an alternate form that gives preference 
> to a particular volume, so the rename stays on the same disk.
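
To make the suggestion in the quoted description concrete, here is a minimal sketch of what such an alternate form could look like. The names (VolumeChooser, Volume, getAvailable) are illustrative only and do not match the real FSDataset/FSVolume API; the point is simply a chooser that stays on the block's current volume when it still has room and falls back to the usual round-robin selection otherwise.

    import java.io.IOException;
    import java.util.List;

    // Hypothetical sketch of a volume chooser with a "preferred volume" form.
    class VolumeChooser {
      interface Volume {
        long getAvailable() throws IOException;
      }

      private final List<Volume> volumes;
      private int curVolume = 0;

      VolumeChooser(List<Volume> volumes) {
        this.volumes = volumes;
      }

      // Existing behavior: round-robin over all volumes with enough space.
      synchronized Volume getNextVolume(long blockSize) throws IOException {
        for (int i = 0; i < volumes.size(); i++) {
          Volume v = volumes.get(curVolume);
          curVolume = (curVolume + 1) % volumes.size();
          if (v.getAvailable() >= blockSize) {
            return v;
          }
        }
        throw new IOException("No volume has " + blockSize + " bytes free");
      }

      // Suggested alternate form: stay on the preferred (current) volume if
      // it still has room, so the append-time move stays on the same disk.
      synchronized Volume getNextVolume(Volume preferred, long blockSize)
          throws IOException {
        if (preferred != null && preferred.getAvailable() >= blockSize) {
          return preferred;
        }
        return getNextVolume(blockSize);
      }
    }
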

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.