Hi Kevin, I am having the same error, but my critical error is:
[2010-04-08 13:47:25,304] CRITICAL/50 hadoop:303 - Cluster could not be allocated because of the following errors. Hodring at n0 failed with following errors: JobTracker failed to initialise

Have you solved this? Thanks!

Boyu

On Tue, Apr 6, 2010 at 11:32 AM, Kevin Van Workum <[email protected]> wrote:

> [sorry for the double posting (to general), but I think this list is
> the appropriate place for this message]
>
> Hello,
>
> I'm trying to setup hadoop on demand (HOD) on my cluster. I'm
> currently unable to "allocate cluster". I'm starting hod with the
> following command:
>
> /usr/local/hadoop-0.20.2/hod/bin/hod -c
> /usr/local/hadoop-0.20.2/hod/conf/hodrc -t
> /b/01/vanw/hod/hadoop-0.20.2.tar.gz -o "allocate ~/hod 3"
> --ringmaster.log-dir=/tmp -b 4
>
> The job starts on the nodes and I see the ringmaster running on the
> MotherSuperior. The ringmaster-main.log file is created and contains:
>
> [2010-04-06 11:18:29,036] DEBUG/10 ringMaster:487 - getServiceAddr
> service: <hodlib.GridServices.mapred.MapReduce instance at 0x12b42518>
> [2010-04-06 11:18:29,038] DEBUG/10 ringMaster:504 - getServiceAddr
> addr mapred: not found
> [2010-04-06 10:47:43,183] DEBUG/10 ringMaster:479 - getServiceAddr name:
> hdfs
> [2010-04-06 10:47:43,184] DEBUG/10 ringMaster:487 - getServiceAddr
> service: <hodlib.GridServices.hdfs.Hdfs instance at 0x122d24d0>
> [2010-04-06 10:47:43,186] DEBUG/10 ringMaster:504 - getServiceAddr
> addr hdfs: not found
>
> I don't see any associated processes running on the other 2 nodes in
> the job.
>
> The critical errors are as follows:
>
> [2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve
> 'hdfs' service address.
> [2010-04-06 10:34:13,631] DEBUG/10 hadoop:631 - Cleaning up cluster id
> 238366.jman, as cluster could not be allocated.
> [2010-04-06 10:34:13,632] DEBUG/10 hadoop:635 - Calling rm.stop()
> [2010-04-06 10:34:13,639] DEBUG/10 hadoop:637 - Returning from rm.stop()
> [2010-04-06 10:34:13,639] CRITICAL/50 hod:401 - Cannot allocate
> cluster /b/01/vanw/hod
> [2010-04-06 10:34:14,149] DEBUG/10 hod:597 - return code: 7
>
> The contents of the hodrc file is:
>
> [hod]
> stream = True
> java-home = /usr/local/jdk1.6.0_02
> cluster = orange
> cluster-factor = 1.8
> xrs-port-range = 32768-65536
> debug = 4
> allocate-wait-time = 3600
> temp-dir = /tmp/hod
>
> [ringmaster]
> register = True
> stream = False
> temp-dir = /tmp/hod
> http-port-range = 8000-9000
> work-dirs = /tmp/hod/1,/tmp/hod/2
> xrs-port-range = 32768-65536
> debug = 4
>
> [hodring]
> stream = False
> temp-dir = /tmp/hod
> register = True
> java-home = /usr/local/jdk1.6.0_02
> http-port-range = 8000-9000
> xrs-port-range = 32768-65536
> debug = 4
>
> [resource_manager]
> queue = dque
> batch-home = /usr/local/torque-2.3.7
> id = torque
> env-vars =
> HOD_PYTHON_HOME=/usr/local/python-2.5.5/bin/python
>
> [gridservice-mapred]
> external = False
> tracker_port = 8030
> info_port = 50080
>
> [gridservice-hdfs]
> external = False
> fs_port = 8020
> info_port = 50070
>
>
> Some other useful information:
> Linux 2.6.18-128.7.1.el5
> Python 2.5.5
> Twisted 10.0.0
> zope 3.3.0
> java version "1.6.0_02"
> hadoop version 0.20.2
>
>
>
> --
> Kevin Van Workum, PhD
> Sabalcore Computing Inc.
> Run your code on 500 processors.
> Sign up for a free trial account.
> www.sabalcore.com
> 877-492-8027 ext. 11
>
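
In case it helps anyone comparing logs in this thread, here is a rough sketch for pulling only the CRITICAL entries out of HOD log output. It assumes each entry sits on a single line in the "[timestamp] LEVEL/number module:line - message" layout seen in the excerpts above (the wrapping in the quoted mail comes from the mail client, not the log). The names LOG_LINE and critical_entries are my own, not part of HOD, and the script is written for Python 3 rather than the Python 2.5 that HOD runs on.

```python
import re

# Assumed layout, taken from the excerpts in this thread:
# [2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve ...
LOG_LINE = re.compile(
    r"\[(?P<time>[^\]]+)\]\s+(?P<level>[A-Z]+)/\d+\s+"
    r"(?P<module>\w+):(?P<line>\d+)\s+-\s+(?P<message>.*)"
)


def critical_entries(lines):
    """Return (time, module, message) for every CRITICAL log line."""
    found = []
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match and match.group("level") == "CRITICAL":
            found.append(
                (match.group("time"), match.group("module"), match.group("message"))
            )
    return found


# Sample lines copied from the messages above, joined back onto one line each.
sample = [
    "[2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve 'hdfs' service address.",
    "[2010-04-06 10:34:13,632] DEBUG/10 hadoop:635 - Calling rm.stop()",
    "[2010-04-06 10:34:13,639] CRITICAL/50 hod:401 - Cannot allocate cluster /b/01/vanw/hod",
]

for entry in critical_entries(sample):
    print(entry)
```

To run it against a real file, pass the open file object instead of the sample list, for example critical_entries(open(path)) with path pointing at whichever log you want to scan.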
