DFSClient errors when a block replica disappears could be more informative and
less scary.
------------------------------------------------------------------------------------------
Key: HDFS-1631
URL: https://issues.apache.org/jira/browse/HDFS-1631
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs client
Affects Versions: 0.20-append
Reporter: Daniel Einspanjer
Priority: Minor
When HBase starts up, it opens a large number of files from HDFS. If you
monitor the startup closely, you will see many log entries like the examples
below on the client and on the remote DN. The exception is logged at INFO
level on the client side, but at WARN and ERROR on the DN side. In both cases,
it would be nice to add a little more context to the message and reassure the
user that there is nothing to worry about unless the error persists. Also, why
bother with a complete stack trace?
2011-02-17 13:46:30,995 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: crash_reports,81101138606ab7a-85e4-416f-8cd9-bb4002110113,1294954735047.184604411
2011-02-17 13:46:31,056 INFO org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.8.100.16:50010, add to deadNodes and continue
java.io.IOException: Got error in response to OP_READ_BLOCK self=/10.8.100.13:54349, remote=/10.8.100.16:50010 for file /hbase/crash_reports/184604411/meta_data/8999005319667206647 for block -3256062388303907746_2989166
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1471)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1794)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1931)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.hbase.io.hfile.HFile$FixedFileTrailer.deserialize(HFile.java:1492)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:860)
    at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:805)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.loadFileInfo(StoreFile.java:977)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:387)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:438)
    at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:245)
    at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:187)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1923)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:334)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1588)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1553)
    at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1465)
    at java.lang.Thread.run(Thread.java:619)
2011-02-17 13:46:31,456 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined crash_reports,81101138606ab7a-85e4-416f-8cd9-bb4002110113,1294954735047.184604411; next sequenceid=4358482563
On the DN (node 10.8.100.16):
2011-02-17 13:46:31,056 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.8.100.16:50010, storageID=DS-2053327189-10.8.100.16-50010-1291655260617, infoPort=50075, ipcPort=50020):Got exception while serving blk_-3256062388303907746_2989166 to /10.8.100.13:
java.io.IOException: Block blk_-3256062388303907746_2989166 is not valid.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:976)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:939)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getVisibleLength(FSDataset.java:949)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:94)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:206)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:114)
2011-02-17 13:46:31,056 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.8.100.16:50010, storageID=DS-2053327189-10.8.100.16-50010-1291655260617, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Block blk_-3256062388303907746_2989166 is not valid.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:976)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:939)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.getVisibleLength(FSDataset.java:949)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:94)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:206)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:114)
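
To make the request above concrete, here is a minimal sketch of what a less
alarming client-side message might look like: one line, no stack trace, with
the block, file, and datanode named and a note that the read is being retried
elsewhere. The class ReplicaReadMessages, the method describeFailedReplica,
and the remainingReplicas parameter are hypothetical names invented for this
illustration; none of them exist in DFSClient, and the wording is only a
suggestion.

```java
import java.io.IOException;

public class ReplicaReadMessages {

    /**
     * Builds a single-line message for a failed replica read.
     * Hypothetical helper for illustration only; not part of DFSClient.
     */
    public static String describeFailedReplica(String datanode, String block,
            String file, int remainingReplicas, IOException cause) {
        StringBuilder sb = new StringBuilder();
        sb.append("Could not read block ").append(block)
          .append(" of ").append(file)
          .append(" from datanode ").append(datanode)
          .append(" (").append(cause.getMessage()).append("). ");
        if (remainingReplicas > 0) {
            // Reassure the user: a missing replica is expected occasionally.
            sb.append("Trying another replica; this is harmless unless it persists.");
        } else {
            // Nothing left to fall back on, so the caller should escalate.
            sb.append("No other replicas left to try.");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        IOException cause = new IOException("Got error in response to OP_READ_BLOCK");
        System.out.println(describeFailedReplica(
                "/10.8.100.16:50010",
                "blk_-3256062388303907746_2989166",
                "/hbase/crash_reports/184604411/meta_data/8999005319667206647",
                2,
                cause));
    }
}
```

The same idea would apply on the DN side: log the "block is not valid"
condition once, at a single level, with only the exception message.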