Re: hadoop on demand setup: Failed to retrieve 'hdfs' service address

Boyu Zhang Thu, 08 Apr 2010 11:24:16 -0700

Hi Kevin,

I am running into the same problem, but my critical error is different:


[2010-04-08 13:47:25,304] CRITICAL/50 hadoop:303 - Cluster could not be
allocated because of the following errors.
Hodring at n0 failed with following errors:
JobTracker failed to initialise

Have you solved this? Thanks!

Boyu

On Tue, Apr 6, 2010 at 11:32 AM, Kevin Van Workum <[email protected]> wrote:

> [sorry for the double posting (to general), but I think this list is
> the appropriate place for this message]
>
> Hello,
>
> I'm trying to set up hadoop on demand (HOD) on my cluster. I'm
> currently unable to "allocate cluster". I'm starting hod with the
> following command:
>
> /usr/local/hadoop-0.20.2/hod/bin/hod -c
> /usr/local/hadoop-0.20.2/hod/conf/hodrc -t
> /b/01/vanw/hod/hadoop-0.20.2.tar.gz -o "allocate ~/hod 3"
> --ringmaster.log-dir=/tmp -b 4
>
> The job starts on the nodes and I see the ringmaster running on the
> MotherSuperior. The ringmaster-main.log file is created and contains:
>
> [2010-04-06 11:18:29,036] DEBUG/10 ringMaster:487 - getServiceAddr
> service: <hodlib.GridServices.mapred.MapReduce instance at 0x12b42518>
> [2010-04-06 11:18:29,038] DEBUG/10 ringMaster:504 - getServiceAddr
> addr mapred: not found
> [2010-04-06 10:47:43,183] DEBUG/10 ringMaster:479 - getServiceAddr name:
> hdfs
> [2010-04-06 10:47:43,184] DEBUG/10 ringMaster:487 - getServiceAddr
> service: <hodlib.GridServices.hdfs.Hdfs instance at 0x122d24d0>
> [2010-04-06 10:47:43,186] DEBUG/10 ringMaster:504 - getServiceAddr
> addr hdfs: not found
>
> I don't see any associated processes running on the other 2 nodes in
> the job.
>
> The critical errors are as follows:
>
> [2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve
> 'hdfs' service address.
> [2010-04-06 10:34:13,631] DEBUG/10 hadoop:631 - Cleaning up cluster id
> 238366.jman, as cluster could not be allocated.
> [2010-04-06 10:34:13,632] DEBUG/10 hadoop:635 - Calling rm.stop()
> [2010-04-06 10:34:13,639] DEBUG/10 hadoop:637 - Returning from rm.stop()
> [2010-04-06 10:34:13,639] CRITICAL/50 hod:401 - Cannot allocate
> cluster /b/01/vanw/hod
> [2010-04-06 10:34:14,149] DEBUG/10 hod:597 - return code: 7
>
> The contents of the hodrc file are:
>
> [hod]
> stream                          = True
> java-home                       = /usr/local/jdk1.6.0_02
> cluster                         = orange
> cluster-factor                  = 1.8
> xrs-port-range                  = 32768-65536
> debug                           = 4
> allocate-wait-time              = 3600
> temp-dir                        = /tmp/hod
>
> [ringmaster]
> register                        = True
> stream                          = False
> temp-dir                        = /tmp/hod
> http-port-range                 = 8000-9000
> work-dirs                       = /tmp/hod/1,/tmp/hod/2
> xrs-port-range                  = 32768-65536
> debug                           = 4
>
> [hodring]
> stream                          = False
> temp-dir                        = /tmp/hod
> register                        = True
> java-home                       = /usr/local/jdk1.6.0_02
> http-port-range                 = 8000-9000
> xrs-port-range                  = 32768-65536
> debug                           = 4
>
> [resource_manager]
> queue                           = dque
> batch-home                      = /usr/local/torque-2.3.7
> id                              = torque
> env-vars                        = HOD_PYTHON_HOME=/usr/local/python-2.5.5/bin/python
>
> [gridservice-mapred]
> external                        = False
> tracker_port                    = 8030
> info_port                       = 50080
>
> [gridservice-hdfs]
> external                        = False
> fs_port                         = 8020
> info_port                       = 50070
>
>
> Some other useful information:
> Linux 2.6.18-128.7.1.el5
> Python 2.5.5
> Twisted 10.0.0
> zope 3.3.0
> java version "1.6.0_02"
> hadoop version 0.20.2
>
>
>
> --
> Kevin Van Workum, PhD
> Sabalcore Computing Inc.
> Run your code on 500 processors.
> Sign up for a free trial account.
> www.sabalcore.com
> 877-492-8027 ext. 11
>
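
The log excerpts quoted above mix DEBUG and CRITICAL entries, and the mail client has wrapped long messages onto a second line. As a rough aid for comparing the two failures, here is a minimal sketch (Python 3, standard library only) that pulls out the CRITICAL entries and rejoins wrapped lines. The entry pattern is inferred only from the lines quoted in this thread, and the function names are made up for illustration; nothing here is part of HOD itself.

```python
import re

# Start of a log entry as it appears in the excerpts quoted in this thread, e.g.
# "[2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve"
# (pattern inferred from those lines only; an assumption, not a HOD spec).
ENTRY_RE = re.compile(
    r"^\[(?P<time>[^\]]+)\]\s+(?P<level>[A-Z]+)/(?P<num>\d+)\s+"
    r"(?P<source>\S+):(?P<line>\d+)\s+-\s+(?P<message>.*)$"
)


def parse_entries(text):
    """Split log text into entries, rejoining lines wrapped by the mailer."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_RE.match(line)
        if match:
            entries.append(match.groupdict())
        elif entries:
            # No timestamp prefix: treat as a continuation of the previous entry.
            entries[-1]["message"] += " " + line
    return entries


def critical_entries(text):
    """Return only the entries logged at CRITICAL level."""
    return [e for e in parse_entries(text) if e["level"] == "CRITICAL"]


# Sample copied from the quoted message, with the "> " prefixes removed.
SAMPLE = """\
[2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve
'hdfs' service address.
[2010-04-06 10:34:13,631] DEBUG/10 hadoop:631 - Cleaning up cluster id
238366.jman, as cluster could not be allocated.
[2010-04-06 10:34:13,639] CRITICAL/50 hod:401 - Cannot allocate
cluster /b/01/vanw/hod
[2010-04-06 10:34:14,149] DEBUG/10 hod:597 - return code: 7
"""

if __name__ == "__main__":
    for entry in critical_entries(SAMPLE):
        print(entry["time"], entry["source"], entry["message"])
```

Running the same filter over the ringmaster and hodring logs in the configured log directory may make it easier to see which component reported the first failure.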

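Since two people are now seeing allocation failures, it may help to compare hodrc files section by section. The following is a minimal sketch, assuming the hodrc can be read as a plain INI file by Python 3's `configparser`. The sample is an excerpt of the file quoted above, and `load_hodrc` and `values_by_section` are hypothetical helper names, not HOD functions. Note that a wrapped value such as the `env-vars` line would need to be on a single line (or indented) for `configparser` to read it as one setting.

```python
import configparser

# Excerpt of the hodrc quoted in this thread (only a few keys per section).
SAMPLE_HODRC = """\
[hod]
temp-dir                        = /tmp/hod
xrs-port-range                  = 32768-65536
debug                           = 4

[ringmaster]
temp-dir                        = /tmp/hod
http-port-range                 = 8000-9000
xrs-port-range                  = 32768-65536
debug                           = 4

[hodring]
temp-dir                        = /tmp/hod
http-port-range                 = 8000-9000
xrs-port-range                  = 32768-65536
debug                           = 4
"""


def load_hodrc(text):
    """Parse hodrc text as INI; interpolation is disabled so '%' stays literal."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def values_by_section(parser, key):
    """Map each section that defines `key` to its value, for side-by-side review."""
    return {
        section: parser[section][key]
        for section in parser.sections()
        if key in parser[section]
    }


if __name__ == "__main__":
    config = load_hodrc(SAMPLE_HODRC)
    for key in ("temp-dir", "http-port-range", "xrs-port-range"):
        print(key, values_by_section(config, key))
```

This only lists what each section sets; it does not say which of these values, if any, is related to the allocation failure.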