The datanode log looks fine. There was an error while writing to the
mirrors when the data was first written, which can happen occasionally.
It is still not clear why the namenode did not try to re-replicate these
blocks until the next restart.
How big is the cluster?
Raghu.
Chris Kline wrote:
Ah, yes, very interesting. I get a "Read timed out" from 10.100.11.31,
followed by a bunch of "Served block" messages, and then finally a
"Transmitted block" message once HDFS was restarted.
hadoop-rapleaf-datanode-tf4.rapleaf.com.log.2008-01-02:2008-01-02
17:08:17,057 INFO org.apache.hadoop.dfs.DataNode: Received block
blk_4522585614366970680 from /10.100.11.31 and Read timed out
hadoop-rapleaf-datanode-tf4.rapleaf.com.log.2008-01-02:2008-01-02
17:13:17,281 INFO org.apache.hadoop.dfs.DataNode: Served block
blk_4522585614366970680 to /10.100.11.31
------ many more of the above message
hadoop-rapleaf-datanode-tf4.rapleaf.com.log.2008-01-02:2008-01-02
18:53:18,737 INFO org.apache.hadoop.dfs.DataNode: Served block
blk_4522585614366970680 to /10.100.11.31
------ HDFS restarted
hadoop-rapleaf-datanode-tf4.rapleaf.com.log.2008-01-03:2008-01-03
16:17:59,637 INFO org.apache.hadoop.dfs.DataNode: Starting thread to
transfer block blk_4522585614366970680 to
[Lorg.apache.hadoop.dfs.DatanodeInfo;@7e9ffe3f
hadoop-rapleaf-datanode-tf4.rapleaf.com.log.2008-01-03:2008-01-03
16:17:59,718 INFO org.apache.hadoop.dfs.DataNode: Transmitted block
blk_4522585614366970680 to /10.100.11.59:7277
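
For tracing a single block through a datanode log like the excerpt
above, a small script may be easier than reading grep output by eye.
The following is only a minimal sketch. It assumes each log entry has
been unwrapped onto a single line and follows the message format shown
in this thread. The names EVENT_RE, SAMPLE and block_events are
hypothetical helpers for illustration, not part of Hadoop.

```python
import re

# Matches the "Received/Served/Transmitted block" messages seen in this
# thread. The pattern is an assumption based on those lines only; other
# Hadoop versions may log these events differently.
EVENT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"org\.apache\.hadoop\.dfs\.DataNode: "
    r"(?P<action>Received|Served|Transmitted) block "
    r"(?P<block>blk_-?\d+) (?:from|to) /(?P<peer>[\d.]+)"
)

# Sample entries copied from the thread, one log entry per line.
PREFIX = "hadoop-rapleaf-datanode-tf4.rapleaf.com.log."
SAMPLE = [
    PREFIX + "2008-01-02:2008-01-02 17:08:17,057 INFO "
    "org.apache.hadoop.dfs.DataNode: Received block "
    "blk_4522585614366970680 from /10.100.11.31 and Read timed out",
    PREFIX + "2008-01-02:2008-01-02 17:13:17,281 INFO "
    "org.apache.hadoop.dfs.DataNode: Served block "
    "blk_4522585614366970680 to /10.100.11.31",
    PREFIX + "2008-01-03:2008-01-03 16:17:59,637 INFO "
    "org.apache.hadoop.dfs.DataNode: Starting thread to transfer block "
    "blk_4522585614366970680 to [Lorg.apache.hadoop.dfs.DatanodeInfo;@7e9ffe3f",
    PREFIX + "2008-01-03:2008-01-03 16:17:59,718 INFO "
    "org.apache.hadoop.dfs.DataNode: Transmitted block "
    "blk_4522585614366970680 to /10.100.11.59:7277",
]


def block_events(lines, block_id):
    """Return (timestamp, action, peer, timed_out) tuples for one block.

    Lines that do not match EVENT_RE (for example "Starting thread to
    transfer block") are skipped.
    """
    events = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match and match.group("block") == block_id:
            events.append((
                match.group("ts"),
                match.group("action"),
                match.group("peer"),
                "Read timed out" in line,  # mirror write error flag
            ))
    return events


if __name__ == "__main__":
    for event in block_events(SAMPLE, "blk_4522585614366970680"):
        print(event)
```

Run against a full log, this gives a per-block timeline, which makes the
gap between the timed-out write and the eventual "Transmitted block"
after the restart easier to see.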