[sorry for the double posting (to general), but I think this list is the appropriate place for this message]
Hello,

I'm trying to set up Hadoop On Demand (HOD) on my cluster. I'm currently unable to "allocate cluster". I'm starting hod with the following command:

  /usr/local/hadoop-0.20.2/hod/bin/hod -c /usr/local/hadoop-0.20.2/hod/conf/hodrc -t /b/01/vanw/hod/hadoop-0.20.2.tar.gz -o "allocate ~/hod 3" --ringmaster.log-dir=/tmp -b 4

The job starts on the nodes and I see the ringmaster running on the MotherSuperior. The ringmaster-main.log file is created and contains:

  [2010-04-06 11:18:29,036] DEBUG/10 ringMaster:487 - getServiceAddr service: <hodlib.GridServices.mapred.MapReduce instance at 0x12b42518>
  [2010-04-06 11:18:29,038] DEBUG/10 ringMaster:504 - getServiceAddr addr mapred: not found
  [2010-04-06 10:47:43,183] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
  [2010-04-06 10:47:43,184] DEBUG/10 ringMaster:487 - getServiceAddr service: <hodlib.GridServices.hdfs.Hdfs instance at 0x122d24d0>
  [2010-04-06 10:47:43,186] DEBUG/10 ringMaster:504 - getServiceAddr addr hdfs: not found

I don't see any associated processes running on the other 2 nodes in the job.

The critical errors are as follows:

  [2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve 'hdfs' service address.
  [2010-04-06 10:34:13,631] DEBUG/10 hadoop:631 - Cleaning up cluster id 238366.jman, as cluster could not be allocated.
  [2010-04-06 10:34:13,632] DEBUG/10 hadoop:635 - Calling rm.stop()
  [2010-04-06 10:34:13,639] DEBUG/10 hadoop:637 - Returning from rm.stop()
  [2010-04-06 10:34:13,639] CRITICAL/50 hod:401 - Cannot allocate cluster /b/01/vanw/hod
  [2010-04-06 10:34:14,149] DEBUG/10 hod:597 - return code: 7

The contents of the hodrc file are:

  [hod]
  stream = True
  java-home = /usr/local/jdk1.6.0_02
  cluster = orange
  cluster-factor = 1.8
  xrs-port-range = 32768-65536
  debug = 4
  allocate-wait-time = 3600
  temp-dir = /tmp/hod

  [ringmaster]
  register = True
  stream = False
  temp-dir = /tmp/hod
  http-port-range = 8000-9000
  work-dirs = /tmp/hod/1,/tmp/hod/2
  xrs-port-range = 32768-65536
  debug = 4

  [hodring]
  stream = False
  temp-dir = /tmp/hod
  register = True
  java-home = /usr/local/jdk1.6.0_02
  http-port-range = 8000-9000
  xrs-port-range = 32768-65536
  debug = 4

  [resource_manager]
  queue = dque
  batch-home = /usr/local/torque-2.3.7
  id = torque
  env-vars = HOD_PYTHON_HOME=/usr/local/python-2.5.5/bin/python

  [gridservice-mapred]
  external = False
  tracker_port = 8030
  info_port = 50080

  [gridservice-hdfs]
  external = False
  fs_port = 8020
  info_port = 50070

Some other useful information:

  Linux 2.6.18-128.7.1.el5
  Python 2.5.5
  Twisted 10.0.0
  zope 3.3.0
  java version "1.6.0_02"
  hadoop version 0.20.2

-- 
Kevin Van Workum, PhD
Sabalcore Computing Inc.
Run your code on 500 processors.
Sign up for a free trial account.
www.sabalcore.com
877-492-8027 ext. 11
