Re: Q: Sending output of reduce to mapper

Ken Pu Sun, 14 Oct 2007 22:43:17 -0700

Thanks - it certainly helps!

Ken




Arun C Murthy wrote:
> 
> Hi Ken,
> 
> On Sat, Oct 06, 2007 at 08:54:54PM -0700, Ken Pu wrote:
>>
>>Hi,
>>
>>As a beginner of Hadoop, I wonder how to send output key-value pairs of the
>>reducers back to the input of mappers for iterative processing.
>>
> 
> A map-reduce job has only 1 set of maps and 1 set of reduces.
> 
> The way to do what you seek would be to chain jobs together i.e. output of
> job1 becomes input of job2 and so on. That is fairly easy since the output
> of the job (i.e. reduces) is on hdfs, usually.
> 
> The onus of waiting for job completion is on the user code, i.e.
> you have to ensure job1 is complete before launching job2, and so on.
> 
> The way to do that would be: 
> a)
> http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/JobClient.html#runJob(org.apache.hadoop.mapred.JobConf)
> which submits the job and returns only after it completes (success or
> failure).
> b)
> http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/JobClient.html#submitJob(org.apache.hadoop.mapred.JobConf)
> to just submit the job and poll yourself; look at
> src/java/org/apache/hadoop/mapred/JobClient.java, in particular the
> implementation of *runJob*, for how to do that.
> c) If you don't want to poll, use the *job.end.notification.url* property,
> where you can set up a URL which will be invoked once the job completes, to
> do async stuff. (Take a look at
> src/test/org/apache/hadoop/mapred/NotificationTestCase.java for an example of
> how to use that.)
> 
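The chaining described above can be sketched as a small driver loop. This is only an illustration, not code from the thread: it assumes a `hadoop` launcher on the PATH, that the streaming command returns only after its job has finished (much like *runJob*), and that the hypothetical names `streaming_command`, `run_iterations` and the `iter-N` output directories suit your setup.

```python
import subprocess


def streaming_command(hadoop_bin, streaming_jar, input_dir, output_dir,
                      mapper, reducer):
    """Build the argument list for one Hadoop streaming job."""
    return [
        hadoop_bin, "jar", streaming_jar,
        "-input", input_dir,
        "-output", output_dir,
        "-mapper", mapper,
        "-reducer", reducer,
    ]


def run_iterations(initial_input, work_dir, rounds, mapper, reducer,
                   hadoop_bin="hadoop", streaming_jar="hadoop-streaming.jar",
                   run=subprocess.run):
    """Run `rounds` jobs one after another.

    Each job reads the directory written by the previous job, so the
    reducer output of round i becomes the mapper input of round i + 1.
    Returns the output directory of the last round.
    """
    current_input = initial_input
    for i in range(1, rounds + 1):
        # Hypothetical naming scheme: one fresh output directory per round.
        output_dir = f"{work_dir}/iter-{i}"
        cmd = streaming_command(hadoop_bin, streaming_jar, current_input,
                                output_dir, mapper, reducer)
        # check=True raises CalledProcessError on a non-zero exit status,
        # so the next job is never launched after a failed one.
        run(cmd, check=True)
        current_input = output_dir
    return current_input
```

Calling `run_iterations("myInputDirs", "work", 3, "/bin/cat", "/bin/wc")` would launch three jobs in sequence and return `work/iter-3`. The `run` parameter exists only so the loop can be exercised without a cluster.
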
>>What's hadoop streaming?  Can I pipe the output stream of reducers back to
>>the input stream of the mappers to achieve what I want?
>>
> 
> Hadoop streaming is a utility which allows the user to create and run
> map/reduce jobs with any executables as the mapper and/or the reducer. 
> E.g. one can use standard Unix utilities as the mapper/reducer:
> $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
>   -input myInputDirs \
>   -output myOutputDir \
>   -mapper /bin/cat \
>   -reducer /bin/wc
> 
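Since streaming accepts any executable, the mapper and reducer can also be small scripts. Below is a minimal word-count sketch, assuming the usual streaming convention of one record per line with key and value separated by a tab, and reducer input already sorted by key; the function names and the `map`/`reduce` argument are made up for illustration.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts per key. Input must be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


def main(argv, stdin=sys.stdin, stdout=sys.stdout):
    """Act as mapper or reducer depending on the first argument."""
    records = map_lines(stdin) if argv[1] == "map" else reduce_lines(stdin)
    for record in records:
        stdout.write(record + "\n")
```

To use it as a script, one would call `main(sys.argv)` at the bottom of the file and pass the script, with `map` or `reduce` as its argument, as the mapper and reducer executables. The reducer still writes to the job's output directory rather than back to the mappers, so iteration means running another job on that output.
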
> Hope that helps.
> 
> Arun
> 
>>Any pointer would be greatly appreciated.
>>-- 
>>View this message in context:
>>http://www.nabble.com/Q%3A-Sending-output-of-reduce-to-mapper-tf4581957.html#a13079722
>>Sent from the Hadoop Users mailing list archive at Nabble.com.
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Q%3A-Sending-output-of-reduce-to-mapper-tf4581957.html#a13207007
Sent from the Hadoop Users mailing list archive at Nabble.com.
