Have you tried Mahadev's suggestion? You need to set HADOOP_CONF_DIR to
the directory in which your hadoop-site.xml is located, or use
hadoop --config <conf_dir> to submit your job.
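For example, something along these lines (the conf path is a placeholder, and the jar/program names are only illustrative; substitute your own):

```shell
# Placeholder path - substitute the directory that actually
# contains your cluster's hadoop-site.xml.
export HADOOP_CONF_DIR=/path/to/hadoop/conf

# Sanity check before submitting: fail fast if the conf dir
# doesn't actually contain a hadoop-site.xml.
if [ ! -f "$HADOOP_CONF_DIR/hadoop-site.xml" ]; then
    echo "no hadoop-site.xml in $HADOOP_CONF_DIR" >&2
fi

# Then submit normally (jar/class names here are illustrative):
# bin/hadoop jar hadoop-examples.jar wordcount input output
# ...or pass the directory per-invocation instead of exporting:
# bin/hadoop --config "$HADOOP_CONF_DIR" jar hadoop-examples.jar wordcount input output
```

If the variable isn't set (or points at an empty directory), hadoop falls back to its default conf, which is typically why jobs end up in the LocalJobRunner.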

Hairong 

-----Original Message-----
From: Phantom [mailto:[EMAIL PROTECTED] 
Sent: Friday, May 25, 2007 1:37 PM
To: [email protected]; [EMAIL PROTECTED]
Subject: Re: Configuration and Hadoop cluster setup

Here is a copy of my hadoop-site.xml. What am I doing wrong?

<configuration>
        <property>
                <name>fs.default.name</name>
                <value>dev030.sctm.com:9000</value>
        </property>

        <property>
                <name>dfs.name.dir</name>
                <value>/tmp/hadoop</value>
        </property>

        <property>
                <name>mapred.job.tracker</name>
                <value>dev030.sctm.com:50029</value>
        </property>

        <property>
                <name>mapred.job.tracker.info.port</name>
                <value>50030</value>
        </property>

        <property>
                <name>mapred.min.split.size</name>
                <value>65536</value>
        </property>

        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>

</configuration>
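One quick thing worth checking in a file like the one above is whether mapred.job.tracker has been left at "local", since that is what forces the LocalJobRunner. A rough sketch of such a check (plain grep, not a real XML parse, and run from the directory containing the file):

```shell
# Rough check: if mapred.job.tracker is "local", jobs run in-process
# via the LocalJobRunner instead of being distributed on the cluster.
# (Good enough for a quick look; not robust to unusual formatting.)
if grep -A1 '<name>mapred.job.tracker</name>' hadoop-site.xml \
     | grep -q '<value>local</value>'; then
    echo "mapred.job.tracker is local - jobs will NOT be distributed"
else
    echo "mapred.job.tracker points at a real host:port"
fi
```

Against the file posted above this prints the second message, so the LocalJobRunner problem is more likely the client picking up a default config than this file itself.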


On 5/25/07, Vishal Shah <[EMAIL PROTECTED]> wrote:
>
> Hi Avinash,
>
>   Can you share your hadoop-site.xml, mapred-default.xml and slaves files?
> Most probably, you have not set the jobtracker properly in the
> hadoop-site.xml conf file. Check the mapred.job.tracker property in
> your file. It should look something like this:
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>fully.qualified.domainname:40000</value>
>   <description>The host and port that the MapReduce job tracker runs
>   at.  If "local", then jobs are run in-process as a single map
>   and reduce task.
>   </description>
> </property>
>
> -vishal.
>
> -----Original Message-----
> From: Mahadev Konar [mailto:[EMAIL PROTECTED]
> Sent: Friday, May 25, 2007 5:54 AM
> To: [email protected]
> Subject: RE: Configuration and Hadoop cluster setup
>
> Hi,
>   When you run the job, you need to set the environment variable 
> HADOOP_CONF_DIR to the configuration directory that has the 
> configuration file pointing to the right jobtracker.
>
> Regards
> Mahadev
>
> > -----Original Message-----
> > From: Phantom [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, May 24, 2007 4:51 PM
> > To: [email protected]
> > Subject: Re: Configuration and Hadoop cluster setup
> >
> > Yes the files are the same and I am starting the tasks on the
> > namenode server. I also figured out what my problem was with respect
> > to not being able to start the namenode and job tracker on the same
> > machine: I had to reformat the file system. But all this still
> > doesn't cause the WordCount sample to run in a distributed fashion.
> > I can tell because the LocalJobRunner is being used. Do I need to
> > specify the config file to the running instance of the program? If
> > so, how do I do that?
> >
> > Thanks
> > A
> >
> > On 5/24/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
> > >
> > >
> > >
> > > Phantom wrote:
> > > > I am trying to run Hadoop on a cluster of 3 nodes. The namenode
> > > > and the jobtracker web UIs work. I have the namenode running on
> > > > node A and the job tracker running on node B. Is it true that
> > > > the namenode and jobtracker cannot run on the same box?
> > >
> > > The namenode and the jobtracker can most definitely run on the
> > > same box.  As far as I know this is the preferred configuration.
> > >
> > > > Also if I want to run the examples on the cluster, is there
> > > > anything special that needs to be done? When I run the example
> > > > WordCount on a machine C (which is a task tracker and not a job
> > > > tracker) the LocalJobRunner is invoked all the time. I am
> > > > guessing this means that the map tasks are running locally. How
> > > > can I distribute this on the cluster? Please advise.
> > >
> > > Are the conf files on machine C the same as the namenode/jobtracker?
> > > Are they pointing to the namenode and jobtracker, or are they
> > > pointing to local in the hadoop-site.xml file?  Also we have found
> > > it easier (although not necessarily better) to start tasks on the
> > > namenode server.
> > >
> > > It would be helpful to have more information about what is
> > > happening and your setup, as that would help myself and others on
> > > the list debug what may be occurring.
> > >
> > > Dennis Kubes
> > >
> > > >
> > > > Thanks
> > > > Avinash
> > > >
> > >
>
>
