[ 
https://issues.apache.org/jira/browse/HDFS-6871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104458#comment-14104458
 ] 

Colin Patrick McCabe commented on HDFS-6871:
--------------------------------------------

OK, so if I understand [~daryn] and [~umamaheswararao]'s comments correctly, the 
concern is that the NameNode might tell the DataNodes to delete the blocks that 
compose the file before doing the logSync, and then crash without ever syncing.

This is pretty unlikely, and the harm is just that the file that was going to 
be replaced has 0 replicas for its blocks after the NN restarts.  From a user's 
point of view that is not much different from the file having zero length: 
either way, the old data is gone and there is no new data.  Still, I agree that 
this is something we should avoid, since missing blocks will alarm sysadmins.
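
To make the ordering concern concrete, here is a minimal toy sketch of one way 
to avoid it: buffer the edits under the write lock, release the lock, sync, and 
only then hand the old blocks over for deletion.  This is not the actual 
FSNamesystem code or the attached patch.  Every name in it (OverwriteSketch, 
createWithOverwrite, logEdit, invalidate, and so on) is made up for 
illustration, and the lists merely stand in for the edit log and the DataNode 
deletion queue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model only; none of these names are real HDFS classes or methods. */
final class OverwriteSketch {
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    // path -> block IDs of the file currently at that path
    private final Map<String, List<Long>> namespace = new HashMap<>();
    // Edits that have been logged but are not yet durable.
    private final List<String> bufferedEdits = new ArrayList<>();
    // Stand-in for the synced (durable) edit log.
    final List<String> durableEdits = new ArrayList<>();
    // Stand-in for the blocks the DataNodes were told to delete.
    final List<Long> invalidatedBlocks = new ArrayList<>();
    // Ordered trace of the externally visible steps, for inspection.
    final List<String> events = new ArrayList<>();

    /** Seeds the toy namespace with an existing file (no edit is logged). */
    void addFile(String path, List<Long> blocks) {
        fsLock.writeLock().lock();
        try {
            namespace.put(path, new ArrayList<>(blocks));
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    /** Replaces the file at path; syncs and deletes blocks outside the lock. */
    void createWithOverwrite(String path) {
        List<Long> toDelete = new ArrayList<>();
        fsLock.writeLock().lock();
        try {
            List<Long> old = namespace.put(path, new ArrayList<>());
            if (old != null) {
                // Remember the old blocks, but do not delete them yet.
                toDelete = old;
                logEdit("DELETE " + path);
            }
            logEdit("CREATE " + path);
        } finally {
            fsLock.writeLock().unlock();
        }
        // The expensive sync runs after the write lock has been released.
        logSync();
        // Blocks are handed over for deletion only once the edits are durable.
        invalidate(toDelete);
    }

    private synchronized void logEdit(String edit) {
        bufferedEdits.add(edit);
    }

    private synchronized void logSync() {
        durableEdits.addAll(bufferedEdits);
        bufferedEdits.clear();
        events.add("sync");
    }

    private synchronized void invalidate(List<Long> blocks) {
        if (!blocks.isEmpty()) {
            invalidatedBlocks.addAll(blocks);
            events.add("invalidate");
        }
    }
}
```

In this toy model, a crash between the unlock and the sync loses only the 
buffered edits; the old file and its blocks are still intact after a restart, 
because nothing was sent for deletion before the sync completed.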

If we're going to deal with this case correctly, is it easier to just use 
[~jingzhao]'s suggestion of using a single editlog record for 
(createFile+overwrite)?  The current patch feels kind of hacky, or maybe that's 
just me.

> Improve NameNode performance when creating file  
> -------------------------------------------------
>
>                 Key: HDFS-6871
>                 URL: https://issues.apache.org/jira/browse/HDFS-6871
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode, performance
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>            Priority: Critical
>             Fix For: 2.6.0
>
>         Attachments: HDFS-6871.001.patch, HDFS-6871.002.patch, 
> HDFS-6871.003.patch
>
>
> Creating a file with the overwrite flag causes the NN to flush the edit logs 
> and block other requests if the file already exists.
> When we create a file with the overwrite flag (default is true) in HDFS, the 
> NN removes the original file if it exists. In FSNamesystem#startFileInternal, 
> the NN already holds the write lock when it calls {{deleteInt}} for the 
> existing file, and {{deleteInt}} contains a logSync. So in this case logSync 
> runs under the write lock, which heavily affects NN performance. 
> We should skip the forced logSync in {{deleteInt}} in this case.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
