[ 
https://issues.apache.org/jira/browse/HDFS-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-5185:
--------------------------------

    Attachment: HDFS-5185-002.patch

Attaching the updated patch.
After recent changes, {{checkDiskError()}} triggers a periodic thread that 
checks for disk errors asynchronously. This issue, however, requires a 
synchronous check for errors before the block pools are initialized.
Accordingly, the patch checks for errors synchronously before initializing 
block pools, so that failed disks are excluded and startup failures are avoided.
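
To illustrate the idea only (this is a rough sketch, not the code in the patch): 
the class and method names below are hypothetical and are not Hadoop APIs. It 
assumes that a data dir counts as failed when a working subdirectory cannot be 
created or written under it, which is the failure seen in the reported 
"Mkdirs failed to create" exception.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: class and method names are hypothetical, not Hadoop APIs.
final class VolumeStartupCheck {

    // Synchronous probe: can a working subdirectory be created and written
    // under this data dir? The "tmp" name is an assumption for illustration.
    static boolean isUsable(File dataDir) {
        File probe = new File(dataDir, "tmp");
        // mkdirs() returns false if the directory already exists,
        // so an existing directory is accepted as well.
        boolean present = probe.mkdirs() || probe.isDirectory();
        return present && probe.canWrite();
    }

    // Runs the probe on every configured dir before any block pool is
    // initialized, and keeps only the dirs that passed.
    static List<File> usableVolumes(List<File> configured) {
        List<File> healthy = new ArrayList<>();
        for (File dir : configured) {
            if (isUsable(dir)) {
                healthy.add(dir);
            } else {
                System.err.println("Excluding failed data dir: " + dir);
            }
        }
        return healthy;
    }
}
```

In this sketch the caller would initialize block pools only on the list 
returned by usableVolumes(), and would abort startup only if that list is 
empty or smaller than whatever failure tolerance is configured.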

Please review.

> DN fails to startup if one of the data dir is full
> --------------------------------------------------
>
>                 Key: HDFS-5185
>                 URL: https://issues.apache.org/jira/browse/HDFS-5185
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>            Priority: Critical
>         Attachments: HDFS-5185-002.patch, HDFS-5185.patch
>
>
> DataNode fails to start up if one of the configured data dirs is out of space. 
> It fails with the following exception:
> {noformat}2013-09-11 17:48:43,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (storage id DS-308316523-xx.xx.xx.xx-64015-1378896293604) service to /nn1:65110
> java.io.IOException: Mkdirs failed to create /opt/nish/data/current/BP-123456-1234567/tmp
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.<init>(BlockPoolSlice.java:105)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:216)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addBlockPool(FsVolumeList.java:155)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addBlockPool(FsDatasetImpl.java:1593)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:834)
>         at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:217)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> It should continue to start up with the other available data dirs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
