[ 
https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666358#action_12666358
 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

> Which design are we talking about anyway? The document attached to 1700 is 8 
> months behind the patch.

The design document for Appends is still valid, although it is true that the 
patch took a long time to develop.

> I am concerned that incomplete blocks will be promoted, then sent (reported) 
> to the name-node, then processed there and finally most of them will be 
> removed. It's the name-node overhead which is a concern not the data-node.

OK, so this is a performance question rather than a correctness one. I will 
run some tests to measure how much overhead this adds and will report my 
findings soon. The reason I like promoting blocks to the real directory (only 
when the datanode crashes) is that this is data an application has written, 
and I would rather save it than delete it. From my viewpoint, the system 
should make every effort to persist this data, rather than saying "ok, you did 
not invoke sync, so you lose your data". (I remember a discussion with Sameer 
saying that it would be nice to have every new block allocation at the 
namenode be persisted, and persisting the block list at the namenode is 
useless if the datanode deletes blocks that were not closed anyway.)

> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.1
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch, 
> handleTmp1.patch
>
>
> Before 0.18, when the Datanode restarts, it deletes files under the 
> data-dir/tmp directory since these files are no longer valid. But in 0.18 it 
> moves these files to the normal directory, incorrectly making them valid 
> blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in pre-18 format (i.e. no generation stamp), 
> delete them.
> Currently the effect of this bug is that these files end up failing block 
> verification and eventually get deleted, but they cause incorrect 
> over-replication at the namenode before that.
> Also, it looks like our policy regarding files under tmp needs to be defined 
> better. Right now there are probably one or two more bugs with it. 
> Dhruba, please file them if you remember.
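
For illustration only, the second option in the description (delete tmp files 
that carry no generation stamp) could be sketched roughly as below. This is 
not the code in any of the attached patches. It is written in Python rather 
than the datanode's Java, the function name classify_tmp_blocks is 
hypothetical, and the file-name patterns (blk_<id>.meta before 0.18, 
blk_<id>_<generation stamp>.meta from 0.18 on) are assumptions about the 
on-disk layout.

```python
import re

# Assumed naming conventions (not taken from the attached patches):
#   pre-0.18 metadata file: blk_<id>.meta
#   0.18+ metadata file:    blk_<id>_<generation stamp>.meta
# Block ids are assumed to be signed integers, so a leading "-" is allowed.
META_WITH_GENSTAMP = re.compile(r"^blk_(-?\d+)_(\d+)\.meta$")
META_WITHOUT_GENSTAMP = re.compile(r"^blk_(-?\d+)\.meta$")


def classify_tmp_blocks(filenames):
    """Split block ids found under tmp into (delete, keep) sets.

    A block whose metadata file has no generation stamp is treated as
    pre-0.18 and marked for deletion; a block whose metadata file has a
    generation stamp is kept as a candidate for promotion. Names that
    match neither pattern (including bare data files) are ignored.
    """
    delete, keep = set(), set()
    for name in filenames:
        with_stamp = META_WITH_GENSTAMP.match(name)
        if with_stamp:
            keep.add(with_stamp.group(1))
            continue
        without_stamp = META_WITHOUT_GENSTAMP.match(name)
        if without_stamp:
            delete.add(without_stamp.group(1))
    return delete, keep
```

A caller would pass in the listing of the tmp directory at startup, delete the 
data and metadata files for every id in the first set, and apply whatever 
promotion policy is agreed on to the second.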

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
