Hi all, I am running hadoop-0.19.1 and have run into a strange problem over the last few days. Until several days ago, Hadoop ran smoothly, with three nodes running the TaskTracker and DataNode daemons. However, one of the nodes cannot start its DataNode since I moved the machines to another place.
I have checked the network and the firewall. The network is OK, because I can ssh from the master to all the slaves, and the firewall is not activated on any of the machines. On the failing node, running "jps" shows only TaskTracker and no DataNode. I checked the log and out files in logs/ and found the following messages:

---------------- logs/hadoop-datanode-hdt1.mycluster.com.log---------
...
2009-05-30 14:58:54,830 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is shutting down: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 10.61.0.143:50010 is attempting to report storage ID DS-983240698-127.0.0.1-50010-1236515374222. Node 10.61.0.5:50010 is expected to serve this storage.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDatanode(FSNamesystem.java:3800)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processReport(FSNamesystem.java:2801)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReport(NameNode.java:636)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
...
2009-05-30 14:58:54,993 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2009-05-30 14:58:54,994 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 1
2009-05-30 14:58:54,994 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.61.0.143:50010, storageID=DS-983240698-127.0.0.1-50010-1236515374222, infoPort=50075, ipcPort=50020):DataXceiveServer: java.nio.channels.AsynchronousCloseException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:152)
        at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:130)
        at java.lang.Thread.run(Thread.java:619)
... infoPort=50075, ipcPort=50020):Finishing DataNode in: FSDataset{dirpath='/home/hadoop/myhadoop2/hadoop-hdfs/data/current'}
2009-05-30 14:58:56,096 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020
2009-05-30 14:58:56,096 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0
2009-05-30 14:58:56,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at hdt1.mycluster.com/10.61.0.143
************************************************************/
------------------------------------------------------------------------------------------------

What is wrong with my configuration? I just shut down the node machines and moved them to another place; nothing has been changed in the configuration, the OS, or the software. Also, is Hadoop affected by network topology (for example, do all the nodes need to be in the same area, controlled by the same hub)?

Any help?

Thanks again,
Ian
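
P.S. In case it helps anyone compare the addresses in the exception, here is a minimal Python sketch that pulls the three fields out of the UnregisteredDatanodeException message. The regular expression is only my reading of the message text as it appears in the log above, and the function name parse_unregistered is made up for this example; it is not part of Hadoop.

```python
import re

# Pattern for the exception text as it appears in my log (an assumption about
# the format, based only on the single message quoted in this post).
PATTERN = re.compile(
    r"Data node (?P<reporting>\S+) is attempting to report storage ID "
    r"(?P<storage_id>\S+)\. Node (?P<expected>\S+) is expected to serve this storage\."
)


def parse_unregistered(message):
    """Return the reporting node, storage ID and expected node, or None if no match.

    Hypothetical helper for illustration only.
    """
    match = PATTERN.search(message)
    return match.groupdict() if match else None


# The message text copied from the log excerpt in this post.
line = (
    "Data node 10.61.0.143:50010 is attempting to report storage ID "
    "DS-983240698-127.0.0.1-50010-1236515374222. "
    "Node 10.61.0.5:50010 is expected to serve this storage."
)

info = parse_unregistered(line)
print(info)
```

For my log this gives 10.61.0.143:50010 as the reporting node and 10.61.0.5:50010 as the expected node, so the NameNode seems to associate this storage ID with a different address than the one the DataNode is now reporting from.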