Dear All, I am trying to install HOD on a cluster. When I tried to allocate a new Hadoop cluster, I got the following error:

[2010-04-08 13:47:25,304] CRITICAL/50 hadoop:303 - Cluster could not be allocated because of the following errors.
Hodring at n0 failed with following errors:
JobTracker failed to initialise

*The log file ringmaster.log has the following message:*

[2010-04-08 13:46:22,297] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-04-08 13:46:22,299] DEBUG/10 ringMaster:487 - getServiceAddr service: <hodlib.GridServices.hdfs.Hdfs instance at 0x2057b758>
[2010-04-08 13:46:22,300] DEBUG/10 ringMaster:504 - getServiceAddr addr hdfs: not found

*The log file hodring.log has the following message:*

[2010-04-08 13:46:31,749] DEBUG/10 hodRing:416 - hadoopThread still == None ...
[2010-04-08 13:46:31,750] DEBUG/10 hodRing:419 - hadoop input: None
[2010-04-08 13:46:31,752] DEBUG/10 hodRing:428 - isForground: False
[2010-04-08 13:46:31,753] DEBUG/10 hodRing:440 - hadoop run status: True
[2010-04-08 13:46:31,754] DEBUG/10 hodRing:657 - Waiting for jobtracker to initialise
[2010-04-08 13:46:31,755] DEBUG/10 hodRing:659 - jobtracker version : 20
[2010-04-08 13:46:31,756] DEBUG/10 hodRing:664 - jobtracker rpc server : n2:59664
[2010-04-08 13:46:31,757] DEBUG/10 hodRing:670 - Jobtracker jetty : n2:57775
[2010-04-08 13:46:32,042] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 0.5
[2010-04-08 13:46:33,544] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 1.0
[2010-04-08 13:46:35,545] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 2.0
[2010-04-08 13:46:38,546] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 4.0
[2010-04-08 13:46:43,547] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 8.0
[2010-04-08 13:46:52,548] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 16.0
4864033937778270/hdfs-nn/dfs-name']
[2010-04-08 13:47:08,552] CRITICAL/50 hodRing:723 - Jobtracker failed to initialise.

*The log file hadoop.log on the actual compute node n0 has:*

2010-04-08 17:47:24,424 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /scratch/hod/mapredsys/zhang/mapredsystem/85.geronimo.gcl.cis.udel.edu/jobtracker.info could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

--------------------------------------------------------------------------------------------------

It looks like the HDFS daemon failed to start, so the JobTracker has no one to communicate with, and then Jetty gave an error.

I am using Hadoop 0.20.2 on Scyld OS. The cluster uses 0-5 (n0-n5) to refer to the back-end compute nodes.

Has anyone had this problem before? Any help will be appreciated.

P.S. Jetty*** tmp files are being generated under /tmp on the compute nodes, even though I set all the tmp dirs to /home or /scratch. Any idea why?

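For anyone who wants to pull the same kind of lines out of their own ringmaster.log or hodring.log, here is a minimal Python sketch. It assumes the line layout seen in the excerpts above (`[timestamp] LEVEL/number module:line - message`); the names `LINE_RE`, `parse_line` and `filter_level` are made up for this illustration and are not part of HOD.

```python
import re

# Assumed layout of a HOD log line, inferred from the excerpts in this post:
#   [YYYY-MM-DD HH:MM:SS,mmm] LEVEL/NN module:lineno - message
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]\s+"
    r"(?P<level>[A-Z]+)/(?P<num>\d+)\s+"
    r"(?P<module>\w+):(?P<lineno>\d+)\s+-\s+(?P<msg>.*)"
)


def parse_line(line):
    """Return a dict for a line in the assumed layout, or None otherwise."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["num"] = int(record["num"])
    record["lineno"] = int(record["lineno"])
    return record


def filter_level(lines, min_num=50):
    """Keep only parsed records whose numeric level is at least min_num."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record is not None and record["num"] >= min_num:
            records.append(record)
    return records
```

Calling `filter_level(open("hodring.log"))` would keep only the CRITICAL/50 entries. Lines that do not match the layout, such as Java stack trace lines, are skipped.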
Here is my hod conf file:

[hod]
stream = True
java-home = /usr
cluster = geronimo
cluster-factor = 1.8
xrs-port-range = 32768-65536
debug = 4
allocate-wait-time = 3600
temp-dir = /home/zhang/hodtmp.$PBS_JOBID

[ringmaster]
register = True
stream = False
temp-dir = /scratch/hod/ringmastertmp.$PBS_JOBID
http-port-range = 8000-9000
work-dirs = /scratch/hod/tmp/1,/scratch/hod/tmp/2
xrs-port-range = 32768-65536
debug = 4

[hodring]
stream = False
temp-dir = /scratch/hod/hodringtmp.$PBS_JOBID
register = True
java-home = /usr
http-port-range = 8000-9000
xrs-port-range = 32768-65536
debug = 4
mapred-system-dir-root = /scratch/hod/mapredsys

[resource_manager]
queue = batch
batch-home = /usr
id = torque
env-vars = HOD_PYTHON_HOME=/opt/python/2.5.1/bin/python

[gridservice-mapred]
external = False
pkgs = /home/zhang/hadoop-0.20.2
tracker_port = 8030
info_port = 50080

[gridservice-hdfs]
external = False
pkgs = /home/zhang/hadoop-0.20.2
fs_port = 8020
info_port = 50070

Thanks a lot!!
Boyu
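
To double-check the claim that every directory option in the conf file points under /home or /scratch, a small Python sketch like the following could be used. It is only a sanity check on the config text and does not explain where the Jetty files come from. The choice of options in `DIR_OPTIONS` and the helper names `collect_dirs` and `outside_roots` are assumptions made for this illustration, not anything HOD provides.

```python
import configparser

# Options in the conf file above that hold directory paths.
# This selection is an assumption made for the sketch.
DIR_OPTIONS = ("temp-dir", "work-dirs", "mapred-system-dir-root")


def collect_dirs(conf_text):
    """Map (section, option) to the list of paths configured for it."""
    # interpolation=None keeps values such as $PBS_JOBID untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    found = {}
    for section in parser.sections():
        for option in DIR_OPTIONS:
            if parser.has_option(section, option):
                raw = parser.get(section, option)
                # work-dirs is a comma-separated list; the others are single paths.
                found[(section, option)] = [
                    part.strip() for part in raw.split(",") if part.strip()
                ]
    return found


def outside_roots(found, roots=("/home", "/scratch")):
    """Return ((section, option), path) pairs that are not under any root."""
    offenders = []
    for key, paths in found.items():
        for path in paths:
            if not any(path == r or path.startswith(r + "/") for r in roots):
                offenders.append((key, path))
    return offenders
```

Running `outside_roots(collect_dirs(open("hodrc").read()))` (the file name is a placeholder) should return an empty list if no listed option points outside /home or /scratch.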