[
https://issues.apache.org/jira/browse/HDFS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985184#action_12985184
]
Konstantin Boudnik commented on HDFS-1566:
------------------------------------------
Right, manual reproduction isn't an issue, I believe (thanks for the tip about
keeping the logs along with the data - that should help for sure).
After looking into this problem some more, it became apparent that FI (fault
injection) is unlikely to help. For one, an IOException for an out-of-space disk
isn't specific enough, and it'd be hard to fine-tune the injection. I am not
saying it isn't possible, but it seems to be way harder than trying to reproduce
the issue at the system test level (e.g. with a loop device or an in-memory
partition).
> Test that covers full partition
> ---------------------------------
>
> Key: HDFS-1566
> URL: https://issues.apache.org/jira/browse/HDFS-1566
> Project: Hadoop HDFS
> Issue Type: Test
> Components: name-node
> Affects Versions: 0.20.2
> Reporter: Eli Collins
> Assignee: Konstantin Boudnik
> Fix For: 0.23.0
>
>
> We've seen the following bug; HDFS needs a test to reproduce it:
> * /var filled up
> * 2NN failed checkpoint due to no space left on device
> * NN log hit end of disk
> * NN seems to have exited on the spot, mid-log-message
> * NN edits are left corrupted
> ** Half of a rename made it into the log
> ** valid data appears to end on a sector boundary
> ** this is true across all of the edit dirs
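
The failure mode in the quoted report (a record cut off when the disk fills up)
can be illustrated with a minimal, self-contained sketch. This is not HDFS code
and does not use the real edit log format: BoundedStream, writeUntilFull and
countCompleteRecords are hypothetical names, the "disk" is an in-memory buffer
with a fixed capacity, and records are plain length-prefixed strings. It only
shows how a write that hits the capacity limit leaves half a record behind, and
how a reader can tell where the valid data ends.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class FullPartitionSketch {

    /** Hypothetical stand-in for a partition with a fixed capacity in bytes. */
    static class BoundedStream extends OutputStream {
        private final ByteArrayOutputStream disk = new ByteArrayOutputStream();
        private final int capacity;

        BoundedStream(int capacity) {
            this.capacity = capacity;
        }

        @Override
        public void write(int b) throws IOException {
            // Mimic an out-of-space error with a generic IOException.
            if (disk.size() >= capacity) {
                throw new IOException("No space left on device");
            }
            disk.write(b);
        }

        byte[] contents() {
            return disk.toByteArray();
        }
    }

    /** Writes length-prefixed records until the simulated disk is full. */
    static byte[] writeUntilFull(int capacity, String[] records) {
        BoundedStream disk = new BoundedStream(capacity);
        DataOutputStream out = new DataOutputStream(disk);
        try {
            for (String record : records) {
                out.writeUTF(record); // 2-byte length prefix + payload
            }
        } catch (IOException e) {
            // The writer stops here; bytes written so far stay on the "disk".
        }
        return disk.contents();
    }

    /** Counts records that can be read back in full; stops at a truncated tail. */
    static int countCompleteRecords(byte[] log) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
        int complete = 0;
        try {
            while (in.available() > 0) {
                in.readUTF();
                complete++;
            }
        } catch (IOException e) {
            // EOFException (truncated record) or malformed data: stop counting.
        }
        return complete;
    }

    public static void main(String[] args) {
        String[] ops = {"mkdir /a", "rename /a /b"};
        // Capacity chosen so that the second record is cut off part-way.
        byte[] log = writeUntilFull(17, ops);
        System.out.println("bytes on disk: " + log.length);
        System.out.println("complete records: " + countCompleteRecords(log));
    }
}
```

A system-level test on a loop device or in-memory partition, as suggested in
the comment above, would exercise the real write path instead of a simulated
stream; the sketch only makes the "half of a rename made it into the log"
symptom concrete.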