[ https://issues.apache.org/jira/browse/HDFS-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HDFS-1238.
------------------------------------

    Resolution: Incomplete

Stale.

> A block is stuck in ongoingRecovery due to exception not propagated 
> --------------------------------------------------------------------
>
>                 Key: HDFS-1238
>                 URL: https://issues.apache.org/jira/browse/HDFS-1238
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 0.20.1
>            Reporter: Thanh Do
>
> - Setup:
> + # datanodes = 2
> + replication factor = 2
> + failure type = transient, i.e. a Java I/O call that throws an IOException
>   or returns false (see the sketch after this list)
> + # failures = 2
> + When/where failures happen: (this is a subtle bug) The first failure is a
>   transient failure at a datanode during the second phase. Due to the first
>   failure, the DFSClient will call recoverBlock. The second failure is
>   injected during this recoverBlock process (i.e. another failure during the
>   recovery process).
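>
> Purely for illustration, a minimal sketch of what we mean by a transient
> failure (hypothetical code, not our actual injection framework): the first
> call to an I/O operation fails, and later retries of the same operation
> succeed.
>
>     import java.io.File;
>
>     // Hypothetical stand-in for a fault-injection wrapper: the first call
>     // "fails" by returning false, the same way File.renameTo() can, and
>     // later calls pass through to the real operation.
>     class TransientFaultSketch {
>       private boolean injected = false;
>
>       boolean renameTo(File from, File to) {
>         if (!injected) {
>           injected = true;        // inject the transient failure exactly once
>           return false;           // the caller sees renameTo() "fail"
>         }
>         return from.renameTo(to); // subsequent retries behave normally
>       }
>     }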
>  
> - Details:
>  
> The expectation here is that since the DFSClient performs lots of retries,
> two transient failures should be masked properly by the retries.
> We found one case where the failures are not transparent to the users.
>  
> Here are the stack traces of when/where the two failures happen (please
> ignore the line numbers).
>  
> 1. The first failure:
> Exception is thrown at
>     call(void java.io.DataOutputStream.flush())
>     SourceLoc: org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java(252)
> Stack Trace:
>       [0] datanode.BlockReceiver (flush:252)
>       [1] datanode.BlockReceiver (receivePacket:660)
>       [2] datanode.BlockReceiver (receiveBlock:743)
>       [3] datanode.DataXceiver (writeBlock:468)
>       [4] datanode.DataXceiver (run:119)
>  
> 2. The second failure:
> False is returned at
>    call(boolean java.io.File.renameTo(File))
>    SourceLoc: org/apache/hadoop/hdfs/server/datanode/FSDataset.java(105)
> Stack Trace:
>       [0] datanode.FSDataset (tryUpdateBlock:1008)
>       [1] datanode.FSDataset (updateBlock:859)
>       [2] datanode.DataNode (updateBlock:1780)
>       [3] datanode.DataNode (syncBlock:2032)
>       [4] datanode.DataNode (recoverBlock:1962)
>       [5] datanode.DataNode (recoverBlock:2101)
>  
> This is what we found out:
> The first failure causes the DFSClient to (somehow) call recoverBlock,
> which forces us to see the 2nd failure. The 2nd failure makes
> renameTo return false, which then causes an IOException to be thrown
> from the function that calls renameTo.
> But this IOException is not propagated properly!
> It is dropped inside DN.syncBlock(). Specifically, DN.syncBlock()
> calls DN.updateBlock(), which gets the exception. But syncBlock
> only catches it and prints a warning without propagating the exception
> properly. Thus syncBlock returns without any exception,
> and thus recoverBlock returns without executing the finally{} block
> (see below).
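>
> To make this flow concrete, here is a minimal sketch of the pattern
> (hypothetical code; only the method names follow the stack traces above,
> the bodies are our assumptions, not the actual 0.20.1 source):
>
>     import java.io.File;
>     import java.io.IOException;
>
>     class SwallowedExceptionSketch {
>       // Stands in for the FSDataset update path: when renameTo() returns
>       // false, an IOException is thrown to the caller.
>       void updateBlock(File blockFile, File newBlockFile) throws IOException {
>         if (!blockFile.renameTo(newBlockFile)) {   // the 2nd injected failure
>           throw new IOException("Cannot rename " + blockFile
>               + " to " + newBlockFile);
>         }
>       }
>
>       // Stands in for DN.syncBlock(): the IOException from updateBlock() is
>       // caught and only logged, so the caller (recoverBlock) sees a normal
>       // return instead of a failure.
>       void syncBlock(File blockFile, File newBlockFile) {
>         try {
>           updateBlock(blockFile, newBlockFile);
>         } catch (IOException e) {
>           System.err.println("WARN: updateBlock failed: " + e);
>           // not rethrown -> the failure is invisible to recoverBlock()
>         }
>       }
>     }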
>  
> Now, the client will retry recoverBlock 3-5 more times,
> but these retries always see exceptions! The reason is that the first
> time we call recoverBlock(blk), this blk is put into
> the ongoingRecovery list inside DN.recoverBlock().
> Normally, blk is only removed (ongoingRecovery.remove(blk)) inside the
> finally{} block.
> But since the exception is not propagated properly, this finally{}
> block is never executed, and thus the blk is stuck
> forever inside the ongoingRecovery list. Hence, the next time the
> client performs the retry, it gets the error message
> "Block ... is already being recovered" and recoverBlock() throws an
> IOException. As a result, the client, which calls this whole
> process in the context of processDatanodeError, returns
> from the pde function with closed = true, and hence it never
> retries the whole thing again from the beginning; instead it
> just returns an error.
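>
> For reference, a minimal sketch of the ongoingRecovery guard described
> above (hypothetical code, not the actual DataNode.recoverBlock()): the
> block is registered on entry, duplicate requests are rejected with an
> IOException, and the entry is only removed in the finally{} block.
>
>     import java.io.IOException;
>     import java.util.HashMap;
>     import java.util.Map;
>
>     class OngoingRecoverySketch {
>       private final Map<String, String> ongoingRecovery =
>           new HashMap<String, String>();
>
>       void recoverBlock(String blk) throws IOException {
>         synchronized (ongoingRecovery) {
>           if (ongoingRecovery.containsKey(blk)) {
>             // every client retry hits this once blk is stuck in the list
>             throw new IOException("Block " + blk
>                 + " is already being recovered, ignoring this request");
>           }
>           ongoingRecovery.put(blk, blk);
>         }
>         try {
>           syncBlock(blk);  // failures in here are swallowed, see the sketch above
>         } finally {
>           synchronized (ongoingRecovery) {
>             // normally blk is removed here once recovery finishes; in our
>             // scenario this removal never happens, so blk stays stuck
>             ongoingRecovery.remove(blk);
>           }
>         }
>       }
>
>       private void syncBlock(String blk) { /* see the previous sketch */ }
>     }
>
> Once the removal is skipped, every retry throws at the guard at the top,
> which is exactly the "already being recovered" error the client keeps seeing.
>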
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (than...@cs.wisc.edu) and
> Haryadi Gunawi (hary...@eecs.berkeley.edu)



--
This message was sent by Atlassian JIRA
(v6.2#6252)
