[
https://issues.apache.org/jira/browse/HADOOP-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473216
]
Raghu Angadi commented on HADOOP-990:
-------------------------------------
When HADOOP-940 is fixed, this issue becomes less critical.
Currently, the DataNode picks a volume (disk) that satisfies 'volume.freespace >=
blockSize'. I think we can change that to two loops: the first loop checks for
'volume.freespace >= 10*blockSize', and if no volume qualifies, the second loop
falls back to 'volume.freespace >= blockSize'. This would reduce the number of
such exceptions. A factor of 10 might still be too small; maybe 100?
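A minimal sketch of that two-pass selection, for illustration only; the Volume
interface, getAvailable() method, and PREFERRED_FACTOR constant here are
hypothetical stand-ins, not the actual dfs internals:

import java.util.List;

class VolumeChooser {
    // Headroom multiplier from the comment above; could be 10 or 100.
    static final long PREFERRED_FACTOR = 10;

    /** Hypothetical volume abstraction with a free-space query. */
    interface Volume {
        long getAvailable();
    }

    /**
     * First pass prefers volumes with ample headroom; the second pass
     * falls back to any volume that can still hold a single block.
     */
    static Volume chooseVolume(List<Volume> volumes, long blockSize) {
        for (Volume v : volumes) {
            if (v.getAvailable() >= PREFERRED_FACTOR * blockSize) {
                return v;
            }
        }
        for (Volume v : volumes) {
            if (v.getAvailable() >= blockSize) {
                return v;
            }
        }
        return null; // genuinely out of space; caller reports the error
    }
}

With this, a nearly full drive is only chosen when no volume has headroom,
so the DiskOutOfSpaceException below should become much rarer.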
> Datanode doesn't retry when a write to one (full) drive fails
> -------------------------------------------------------------
>
> Key: HADOOP-990
> URL: https://issues.apache.org/jira/browse/HADOOP-990
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Reporter: Koji Noguchi
> Assigned To: Raghu Angadi
>
> When one drive is 99.9% full and the datanode chooses that drive to write
> to, it fails with:
> 2007-02-07 18:16:56,574 WARN org.apache.hadoop.dfs.DataNode: DataXCeiver
> org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: No space left on
> device
> at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:801)
> at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:563)
> at java.lang.Thread.run(Thread.java:595)
> Combined with HADOOP-940, this leaves the failed blocks under-replicated.