Detailed instructions for configuring your setup to run across one or more machines are on the Hadoop wiki (http://wiki.apache.org/lucene-hadoop/GettingStartedWithHadoop) and in the Hadoop API documentation (http://lucene.apache.org/hadoop/api/index.html: scroll down to the 'Getting Started' section).

It's usually enough to add your settings to hadoop-site.xml, which should be in the conf directory. You don't need to touch the jar files. You can also modify mapred-default.xml if you don't like the default settings for the number of mapper and reducer tasks, but the defaults are fine for most purposes, especially when you're just getting things running.
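As a rough sketch, a minimal conf/hadoop-site.xml for a small cluster might look something like the following. The host name master.example.com and the port numbers are placeholders I've made up for illustration, and the exact value formats can differ between Hadoop versions, so check them against the hadoop-default.xml that ships with your release before relying on this.

```xml
<?xml version="1.0"?>
<!-- Site-specific overrides; lives in the conf directory, not inside the jar. -->
<configuration>

  <!-- Where the DFS namenode runs (hypothetical host and port). -->
  <property>
    <name>fs.default.name</name>
    <value>master.example.com:9000</value>
  </property>

  <!-- Where the MapReduce job tracker runs (hypothetical host and port). -->
  <property>
    <name>mapred.job.tracker</name>
    <value>master.example.com:9001</value>
  </property>

  <!-- Number of copies of each block to keep; 1 is only sensible for testing. -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

</configuration>
```

If you later want to change the number of map and reduce tasks, the properties to look at are mapred.map.tasks and mapred.reduce.tasks, set using the same <property> layout.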
> -----Original Message-----
> From: Samuel LEMOINE [mailto:[EMAIL PROTECTED]
> Sent: Thursday, July 26, 2007 8:28 AM
> To: [email protected]
> Subject: Configuring hadoop
>
> Hi all !
>
> I'm working on hadoop and currently i'm using the examples
> provided (WordCount & Grep especially).
> I've managed to make those examples work on a local machine,
> and now I'd like to go on to the next step: parallelization.
> My problem is that I don't know where to configure this.
> According to hadoop wiki, the files mapred-default.xml and
> hadoop-site.xml should be modified, but my actual problem is
> that I don't know which ones of these files should be changed
> ? I mean, do I have to unzip the hadoop-core jar, modify the
> xml files, and zip it back ? Or these files maybe just need
> to be present in the directory of the included jar ?
> Sorry if the information was present somewhere else, I
> haven't found it.
>
> Thanks in advance,
>
> Samuel
