Hadoop 2.4.1
DataNode disk failure.
'Number of Under-Replicated Blocks' stays at zero even after a volume fails.
A second disk failure would therefore result in the loss of files (CORRUPT).
How do I fix this?
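To check whether any blocks are actually missing or corrupt, rather than relying on the 'Number of Under-Replicated Blocks' counter, `hdfs fsck` can be run against the namespace. A hedged sketch, assuming a running Hadoop 2.x cluster with the `hdfs` CLI on the PATH (the `/t.mp4` path is the test file from step 3 below):

```shell
# Sketch: verify block health directly (requires a running HDFS cluster;
# these fsck flags exist in Hadoop 2.x).
if command -v hdfs >/dev/null 2>&1; then
  # List files that currently have missing or corrupt blocks:
  hdfs fsck / -list-corruptfileblocks
  # Show where each block of the test file is replicated:
  hdfs fsck /t.mp4 -files -blocks -locations
else
  echo "hdfs CLI not found; run this on a cluster node"
fi
```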
 
 
1. dfshealth.html
Configured Capacity:    42.91 TB
DFS Used:       1.86 GB
Non DFS Used:   29.63 TB
DFS Remaining:  13.28 TB
DFS Used%:      0%
DFS Remaining%: 30.94%
Block Pool Used:        1.86 GB
Block Pool Used%:       0%
DataNodes usages% (Min/Median/Max/stdDev):      0.00% / 0.01% / 0.01% / 0.00%
Live Nodes      2 (Decommissioned: 0)
Dead Nodes      0 (Decommissioned: 0)
Decommissioning Nodes   0
Number of Under-Replicated Blocks       0
 
2. chmod 444 /raid0/data01 (simulate a volume failure)
3. bin/hdfs dfs -get /t.mp4 /tmp/t4.mp4 (read a file)
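The failure mode behind steps 2-3 can be sketched locally without a cluster (paths below are hypothetical, not the real data directory): a read-only data directory produces the same "Can not create directory ... Permission denied" class of error that DiskChecker reports in the DataNode log in step 5.

```shell
# Local sketch of the read-only-volume failure mode (hypothetical paths).
tmp=$(mktemp -d)
mkdir -p "$tmp/data01/dfs/data/current"
chmod 444 "$tmp/data01"            # step 2: make the "volume" read-only
if mkdir "$tmp/data01/finalized" 2>/dev/null; then
  echo "volume still writable"     # happens when running as root
else
  echo "volume failed: cannot create directory"
fi
chmod 755 "$tmp/data01"            # restore permissions so cleanup works
rm -rf "$tmp"
```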
4. namenode log (volume failure)
2014-10-10 14:55:21,027 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: 
Disk error on DatanodeRegistration(192.168.55.151, 
datanodeUuid=b565d54d-0817-4aa5-884e-1e060179f43f, infoPort=40075, 
ipcPort=40020, storageInfo=lv=-55;cid=CID-TEST-ZONE;nsid=326408948;c=0): 
DataNode failed volumes:/raid0/data01/dfs/data/current;
2014-10-10 14:55:25,400 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Reported block 
blk_1073741848_1024 on 192.168.55.151:40010 size 49940112 replicaState = 
FINALIZED
2014-10-10 14:55:25,400 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: In memory 
blockUCState = COMPLETE
2014-10-10 14:55:25,400 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Reported block 
blk_1073741842_1018 on 192.168.55.151:40010 size 134217728 replicaState = 
FINALIZED
2014-10-10 14:55:25,400 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: In memory 
blockUCState = COMPLETE
2014-10-10 14:55:25,400 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Reported block 
blk_1073741844_1020 on 192.168.55.151:40010 size 134217728 replicaState = 
FINALIZED
2014-10-10 14:55:25,400 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: In memory 
blockUCState = COMPLETE
2014-10-10 14:55:25,400 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Reported block 
blk_1073741846_1022 on 192.168.55.151:40010 size 134217728 replicaState = 
FINALIZED
2014-10-10 14:55:25,400 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: In memory 
blockUCState = COMPLETE
2014-10-10 14:55:25,431 INFO BlockStateChange: BLOCK* processReport: from 
storage DS-4de98631-ddec-4118-8654-2961b1815230 node 
DatanodeRegistration(192.168.55.151, 
datanodeUuid=b565d54d-0817-4aa5-884e-1e060179f43f, infoPort=40075, 
ipcPort=40020, storageInfo=lv=-55;cid=CID-TEST-ZONE;nsid=326408948;c=0), 
blocks: 4, processing time: 32 msecs
 
5. datanode log (volume failure)
2014-10-10 14:55:21,473 WARN 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing 
failed volume /raid0/data01/dfs/data/current: 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Can not create 
directory: 
/raid0/data01/dfs/data/current/BP-1269062812-127.0.0.1-1412645127175/current/finalized
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:91)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.LDir.checkDirTree(LDir.java:160)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.checkDirs(BlockPoolSlice.java:255)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.checkDirs(FsVolumeImpl.java:209)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.checkDirs(FsVolumeList.java:168)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.checkDataDir(FsDatasetImpl.java:1317)
        at 
org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:1421)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.validateBlockFile(FsDatasetImpl.java:1117)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:350)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:343)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:150)
        at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:265)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:493)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
        at java.lang.Thread.run(Thread.java:662)
2014-10-10 14:55:21,491 WARN 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to 
write dfsUsed to 
/raid0/data01/dfs/data/current/BP-1269062812-127.0.0.1-1412645127175/current/dfsUsed
java.io.FileNotFoundException: 
/raid0/data01/dfs/data/current/BP-1269062812-127.0.0.1-1412645127175/current/dfsUsed
 (Permission denied)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:145)
        at java.io.FileWriter.<init>(FileWriter.java:73)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.saveDfsUsed(BlockPoolSlice.java:213)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.shutdown(BlockPoolSlice.java:424)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:252)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.checkDirs(FsVolumeList.java:175)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.checkDataDir(FsDatasetImpl.java:1317)
        at 
org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:1421)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.validateBlockFile(FsDatasetImpl.java:1117)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:350)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:343)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:150)
        at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:265)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:493)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
        at java.lang.Thread.run(Thread.java:662)
2014-10-10 14:55:21,494 WARN 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Completed 
checkDirs. Removed 1 volumes. Current volumes: [/raid0/data02/dfs/data/current]
2014-10-10 14:55:21,494 WARN 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing 
replica BP-1269062812-127.0.0.1-1412645127175:1073741841 on failed volume 
/raid0/data01/dfs/data/current
2014-10-10 14:55:21,494 WARN 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing 
replica BP-1269062812-127.0.0.1-1412645127175:1073741843 on failed volume 
/raid0/data01/dfs/data/current
2014-10-10 14:55:21,494 WARN 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing 
replica BP-1269062812-127.0.0.1-1412645127175:1073741845 on failed volume 
/raid0/data01/dfs/data/current
2014-10-10 14:55:21,494 WARN 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing 
replica BP-1269062812-127.0.0.1-1412645127175:1073741847 on failed volume 
/raid0/data01/dfs/data/current
2014-10-10 14:55:21,495 WARN 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removed 4 
out of 8(took 0 millisecs)
2014-10-10 14:55:21,495 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DataNode.handleDiskError: Keep Running: true
2014-10-10 14:55:22,414 DEBUG 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
b=blk_1073741841_1017, 
f=/raid0/data01/dfs/data/current/BP-1269062812-127.0.0.1-1412645127175/current/finalized/blk_1073741841
2014-10-10 14:55:22,414 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opReadBlock BP-1269062812-127.0.0.1-1412645127175:blk_1073741841_1017 received 
exception java.io.IOException: Block blk_1073741841_1017 is not valid.
2014-10-10 14:55:22,449 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(192.168.55.151, 
datanodeUuid=b565d54d-0817-4aa5-884e-1e060179f43f, infoPort=40075, 
ipcPort=40020, storageInfo=lv=-55;cid=CID-TEST-ZONE;nsid=326408948;c=0):Got 
exception while serving 
BP-1269062812-127.0.0.1-1412645127175:blk_1073741841_1017 to 
/192.168.55.151:53669
java.io.IOException: Block blk_1073741841_1017 is not valid.
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:352)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:343)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:150)
        at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:265)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:493)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
        at java.lang.Thread.run(Thread.java:662)
2014-10-10 14:55:22,449 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
namenode02:40010:DataXceiver error processing READ_BLOCK operation  src: 
/192.168.55.151:53669 dst: /192.168.55.151:40010
java.io.IOException: Block blk_1073741841_1017 is not valid.
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:352)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:343)
        at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:150)
        at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:265)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:493)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
        at java.lang.Thread.run(Thread.java:662)
 
6. dfshealth.html (after the volume failure)
Configured Capacity:    42.91 TB
DFS Used:       1.36 GB
Non DFS Used:   29.62 TB
DFS Remaining:  13.28 TB
DFS Used%:      0%
DFS Remaining%: 30.96%
Block Pool Used:        1.36 GB
Block Pool Used%:       0%
DataNodes usages% (Min/Median/Max/stdDev):      0.00% / 0.00% / 0.00% / 0.00%
Live Nodes      2 (Decommissioned: 0)
Dead Nodes      0 (Decommissioned: 0)
Decommissioning Nodes   0
Number of Under-Replicated Blocks       0
 
 
