[ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420829#comment-13420829
 ] 

nkeywal commented on HDFS-3702:
-------------------------------

bq. stack and nkeywal, when a node dies, there is a correlated failure and 
replica count goes down to two. Is this a big problem? HDFS does create an 
additional replica right?
It's not a big problem by itself; it's just that the real replication count is 
2, so it's less safe than 3. Hence the priority is set to minor :-)
It does cause other major & critical problems (hence the other jiras), because 
we try to use the dead node during the recovery (one chance out of 3 per 
block), so the recovery is delayed. As the recovery is distributed, we can be 
quite sure that at least one of the readers will take an added delay, maybe 
multiple times, as there are multiple files.
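
To put a rough number on the "one chance out of 3 per block" point, here is a 
minimal sketch. It assumes each block read picks one of the 3 replicas uniformly 
and independently, which is a simplification and not necessarily how the client 
orders replicas in practice. The function name and parameters are hypothetical.

```python
def p_at_least_one_dead_pick(num_blocks: int, replicas: int = 3, dead: int = 1) -> float:
    """Probability that at least one of num_blocks reads first tries a dead replica.

    Assumption (not from HDFS itself): each read chooses uniformly and
    independently among `replicas` locations, `dead` of which are unreachable.
    """
    if num_blocks < 0 or replicas <= 0 or not 0 <= dead <= replicas:
        raise ValueError("invalid arguments")
    p_live = (replicas - dead) / replicas   # chance a single read avoids the dead node
    return 1.0 - p_live ** num_blocks       # complement of "every read avoids it"


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, round(p_at_least_one_dead_pick(n), 3))
```

Under these assumptions the probability approaches 1 quickly as the number of 
blocks grows, which is why at least one reader is very likely to be delayed.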

We will be able to manage this by setting priorities on blocks, but it would be 
simpler to write the blocks somewhere else instead of skipping them during 
reads... So I would see this as the best medium-term option, for example on 
branch-2.
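
As an illustration only, "writing the blocks somewhere else" could look like the 
toy placement sketch below. This is not the HDFS placement code: it ignores rack 
awareness, and `choose_targets`, `avoid_local` and the host names are all 
hypothetical. It only contrasts "first replica on the client's own datanode" 
with an option that excludes that node.

```python
import random
from typing import List, Optional, Sequence


def choose_targets(datanodes: Sequence[str], client_host: str, replication: int = 3,
                   avoid_local: bool = False,
                   rng: Optional[random.Random] = None) -> List[str]:
    """Pick `replication` distinct datanodes for a new block (toy model)."""
    rng = rng or random.Random()
    # Candidates never include the client's own host; it is handled separately.
    candidates = [d for d in datanodes if d != client_host]
    targets: List[str] = []
    if not avoid_local and client_host in datanodes:
        # Modelled default: the first replica goes to the local datanode.
        targets.append(client_host)
    needed = replication - len(targets)
    if needed > len(candidates):
        raise ValueError("not enough datanodes for the requested replication")
    # Remaining replicas are drawn at random from the other nodes.
    targets.extend(rng.sample(candidates, needed))
    return targets


if __name__ == "__main__":
    nodes = ["dn1", "dn2", "dn3", "dn4", "dn5"]
    print(choose_targets(nodes, "dn1"))                    # local node comes first
    print(choose_targets(nodes, "dn1", avoid_local=True))  # local node excluded
```

With `avoid_local=True`, all 3 replicas land on nodes that should survive the 
loss of the client's machine, so none of them has to be skipped during recovery.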



                
> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3702
>                 URL: https://issues.apache.org/jira/browse/HDFS-3702
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 1.0.3, 2.0.0-alpha
>            Reporter: nkeywal
>            Priority: Minor
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.


        
