Thanks - it certainly helps!

Ken

Arun C Murthy wrote:
>
> Hi Ken,
>
> On Sat, Oct 06, 2007 at 08:54:54PM -0700, Ken Pu wrote:
>>
>>Hi,
>>
>>As a beginner of Hadoop, I wonder how to send output key-value pairs of the
>>reducers back to the input of mappers for iterative processing.
>>
>
> A map-reduce job has only 1 set of maps and 1 set of reduces.
>
> The way to do what you seek would be to chain jobs together, i.e. output of
> job1 becomes input of job2 and so on. That is fairly easy since the output
> of the job (i.e. reduces) is on hdfs, usually.
>
> Clearly the onus of waiting for job-completion is on the user-code, i.e.
> you have to ensure job1 is complete before launching job2 and so on...
>
> The way to do that would be:
> a)
> http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/JobClient.html#runJob(org.apache.hadoop.mapred.JobConf)
> which submits the job and returns only after it completes (success or
> failure).
> b)
> http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/JobClient.html#submitJob(org.apache.hadoop.mapred.JobConf)
> to just submit the job and poll yourself; look at
> src/java/org/apache/hadoop/mapred/JobClient.java and in particular the
> implementation of *runJob* on how to do that.
> c) If you don't want to poll, use the *job.end.notification.url* property
> where you can set up a URL which will be invoked once the job completes to
> do async-stuff. (Take a look at
> src/test/org/apache/hadoop/mapred/NotificationTestCase.java for an e.g. on
> how to use that).
>
>>What's hadoop streaming? Can I pipe the output stream of reducers back to
>>the input stream of the mappers to achieve what I want?
>>
>
> Hadoop streaming is a utility which allows the user to create and run
> map/reduce jobs with any executables as the mapper and/or the reducer.
> E.g. one can use std. unix utilities as the mapper/reducer:
>
> $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
>     -input myInputDirs \
>     -output myOutputDir \
>     -mapper /bin/cat \
>     -reducer /bin/wc
>
> Hope that helps.
>
> Arun
>
>>Any pointer would be greatly appreciated.
>>
>

--
View this message in context: http://www.nabble.com/Q%3A-Sending-output-of-reduce-to-mapper-tf4581957.html#a13207007
Sent from the Hadoop Users mailing list archive at Nabble.com.
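
For anyone reading this thread in the archive, here is a rough sketch of the job-chaining loop Arun describes, where the output directory of one pass becomes the input of the next. It is plain Java with no Hadoop dependency, so the names ChainedJobs, JobRunner, runChain and the "pass-N" directory layout are all made up for illustration. In a real driver, the JobRunner body would presumably build a JobConf for the given input/output paths and call the blocking JobClient.runJob mentioned above (or submit and poll), and decide whether another pass is needed in whatever way suits the algorithm.

```java
import java.util.ArrayList;
import java.util.List;

public class ChainedJobs {

    /**
     * Hypothetical stand-in for one blocking job submission
     * (e.g. a wrapper around JobClient.runJob). Not a Hadoop type.
     */
    public interface JobRunner {
        /** Runs one map-reduce pass; returns true if another pass is needed. */
        boolean run(String inputDir, String outputDir) throws Exception;
    }

    /**
     * Runs up to maxPasses jobs back to back. Each pass reads the previous
     * pass's output directory. Returns every directory used, starting with
     * the initial input.
     */
    public static List<String> runChain(String initialInput, String workDir,
                                        int maxPasses, JobRunner runner)
            throws Exception {
        List<String> dirs = new ArrayList<>();
        dirs.add(initialInput);
        String input = initialInput;
        for (int pass = 1; pass <= maxPasses; pass++) {
            // Assumed layout: one fresh output directory per pass.
            String output = workDir + "/pass-" + pass;
            // The runner must not return until the job has finished,
            // so the next pass never starts on incomplete output.
            boolean more = runner.run(input, output);
            dirs.add(output);
            if (!more) {
                break;
            }
            input = output; // output of this job feeds the next one
        }
        return dirs;
    }

    public static void main(String[] args) throws Exception {
        // Fake runner that pretends the computation converges after 3 passes.
        final int[] calls = {0};
        List<String> dirs = runChain("in", "work", 10, (in, out) -> {
            System.out.println("job: " + in + " -> " + out);
            return ++calls[0] < 3;
        });
        System.out.println(dirs);
    }
}
```

The maxPasses cap is only there so that a computation which never converges does not launch jobs forever; writing each pass to its own directory also avoids a job reading from the directory it is writing to.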
