施兴 wrote:
Hi,

2007-11-20 11:07:28,712 WARN dfs.DataNode - Failed to transfer blk_-3387595792800455675 to 192.168.140.244:50010 got java.net.SocketException: Connection reset
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.dfs.DataNode$BlockSender.sendChunk(DataNode.java:1175)
        at org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:1208)
        at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:1460)
        at java.lang.Thread.run(Thread.java:619)

This is only a WARN, but I would like to understand why it happens. Is it a network problem?
This could indicate that the target datanode is running out of resources, such as file handles or CPU; check the ulimit settings for the user account that runs the datanode process.
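A quick way to check this on the target node is to compare the open-file limit with the number of descriptors the datanode process actually holds. This is a sketch; the `DataNode` pattern passed to pgrep is an assumption and may need adjusting to match your process listing:

```shell
# Open-file limit for the current shell/user; run this as the user
# that starts the datanode. The common default of 1024 is often too
# low for a busy datanode.
ulimit -n

# Count file descriptors held by the datanode process, if running.
# "DataNode" as the match pattern is an assumption about the Java
# process's command line.
DN_PID=$(pgrep -f DataNode | head -n 1)
if [ -n "$DN_PID" ]; then
  ls /proc/"$DN_PID"/fd | wc -l
fi
```

If the descriptor count is close to the limit, raise `nofile` for that user (e.g. in /etc/security/limits.conf) and restart the datanode.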
This issue could be intermittent, in which case the DFS will recover from it (hence the WARN and not FATAL).
I've also seen similar issues caused by a poor-quality Ethernet switch that dropped packets, or by bad cabling.
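To rule that out, look at the kernel's per-interface error and drop counters on both nodes; non-zero, steadily growing values under load point at a bad port or cable. A minimal check, assuming a Linux host:

```shell
# /proc/net/dev lists, per interface: rx bytes, packets, errs, drop, ...
# Splitting on colon and whitespace, field 2 is the interface name,
# field 5 the receive errors, field 6 the receive drops.
awk -F'[: ]+' 'NR > 2 { print $2 ": rx_errs=" $5 " rx_drop=" $6 }' /proc/net/dev
```

Run it twice a few minutes apart while DFS traffic is flowing; the counters should stay flat on a healthy link.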
-- 
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com
Contact: info at sigram dot com
