Hi Avinash,
  The way MapReduce works in a distributed environment is:
1) Set up the cluster in distributed mode as described in the wiki:
http://wiki.apache.org/lucene-hadoop/GettingStartedWithHadoop
2) Run MapReduce jobs with the command:
bin/hadoop jar job.jar
Before doing this you need to set the HADOOP_CONF_DIR environment variable to
point to the conf directory that contains the distributed configuration.
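Assuming a cluster already set up per the wiki page above, the two steps might look like the following sketch (the conf path, job.jar, the class name, and the input/output paths are all placeholders for your own setup):

```shell
# Point Hadoop at the directory holding the distributed configuration
# (hadoop-site.xml etc.); adjust the path to your installation.
export HADOOP_CONF_DIR=/path/to/hadoop/conf

# Submit the job jar to the cluster. The jar name, main class, and
# input/output directories here are placeholders for your own job.
bin/hadoop jar job.jar org.example.WordCount input output
```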

The input files need to be uploaded to HDFS first, and then in your JobConf
you need to set job.setInputPath(tempDir) -- where tempDir is the input
directory for the MapReduce job, i.e. the directory where you uploaded the
files. You can take a look at the examples in the Hadoop examples directory
for this.
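The upload step above can be sketched like this (the local and HDFS paths are placeholder names, and this assumes the cluster from step 1 is running):

```shell
# Copy local input files into an HDFS directory before running the job;
# this directory is what you later pass to job.setInputPath(...).
bin/hadoop fs -put /local/path/to/input.txt tempDir/

# Check that the files landed in HDFS as expected.
bin/hadoop fs -ls tempDir/
```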
Hope this helps.

Regards
Mahadev

> -----Original Message-----
> From: Phantom [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, May 29, 2007 11:53 AM
> To: [email protected]
> Subject: Re: Configuration and Hadoop cluster setup
> 
> Either I am totally confused or this configuration stuff is confusing the
> hell out of me. I am pretty sure it is the former. Please I am looking for
> advice here as to how I should do this. I have my fs.default.name set to
> hdfs://<host>:<port>. In my JobConf setup I set the same value for my
> fs.default.name. Now I have two options and I would appreciate it if some
> expert could tell me which option I should take and why.
> 
> (1) Set my fs.default.name to hdfs://<host>:<port> and also specify it
> in the JobConf configuration. Copy my sample input file into HDFS using
> "bin/hadoop fs -put" from my local file system. I then need to specify
> this file to my WordCount sample as input. Should I specify this file with
> the hdfs:// directive ?
> 
> (2) Set my fs.default.name to file://<host>:<port> and also specify it
> in the JobConf configuration. Just specify the input path to the WordCount
> sample, and everything should work if the path is available to all machines
> in the cluster ?
> 
> Which way should I go ?
> 
> Thanks
> Avinash
> 
> On 5/29/07, Phantom <[EMAIL PROTECTED]> wrote:
> >
> > Yes it is.
> >
> > Thanks
> > A
> >
> >
> > On 5/29/07, Doug Cutting <[EMAIL PROTECTED]> wrote:
> > >
> > > Phantom wrote:
> > > > Is there a workaround ? I want to run the WordCount sample against a
> > > > file on
> > > > my local filesystem. If this is not possible do I need to put my
> file
> > > into
> > > > HDFS and then point that location to my program ?
> > >
> > > Is your local filesystem accessible to all nodes in your system?
> > >
> > > Doug
> > >
> >
> >

Reply via email to