Re: No Mapper but Reducer

Harsh J Wed, 07 Sep 2011 04:40:47 -0700

Nope. A reducer's input is from the map outputs alone (fetched in by
the shuffling code), which would not exist here.


What are you looking to do? Why won't a map task suffice for doing that?
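
As an aside on the idle-time estimate in Sudharsan's quoted message below, (X / Y) * H, here is a rough worked example in plain Java. It is only a sketch of that arithmetic, assuming each tasktracker is handed one reduce task per heartbeat, as he describes. The class and method names and all the numbers (100 reducers, 10 tasktrackers, a 3-second heartbeat, 50 jobs) are made up for illustration and are not taken from any real cluster.

```java
// Worked example of the (X / Y) * H idle-time estimate from the quoted message.
// All names and numbers here are hypothetical; they are not measurements.
final class IdleReduceTime {

    /**
     * Estimated seconds a job with no input spends handing out reduce tasks,
     * assuming each tasktracker is assigned one reduce task per heartbeat.
     *
     * reducers         = X, the configured number of reduce tasks
     * taskTrackers     = Y, the number of tasktrackers in the cluster
     * heartbeatSeconds = H, the heartbeat interval in seconds
     */
    static double idleSeconds(int reducers, int taskTrackers, double heartbeatSeconds) {
        if (taskTrackers <= 0) {
            throw new IllegalArgumentException("taskTrackers must be positive");
        }
        return ((double) reducers / taskTrackers) * heartbeatSeconds;
    }

    public static void main(String[] args) {
        double perJob = idleSeconds(100, 10, 3.0);
        System.out.println(perJob);       // prints 30.0
        System.out.println(50 * perJob);  // prints 1500.0 (50 such empty jobs in a cycle)
    }
}
```

With those assumed values a single empty job costs about 30 seconds, and 50 of them in one cycle add up to about 25 minutes of the cluster doing nothing.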

On Wed, Sep 7, 2011 at 4:51 PM, Bejoy KS <bejoy.had...@gmail.com> wrote:
> Thank you all. Even I noticed this strange behavior some time back.
> Now my initial concern still remains. If the input directory I provide is
> empty, then yes, the map tasks won't be executed. But my reducer needs input
> to do the processing/aggregation. In such a scenario, is there an option to
> provide input just to the reducer?
>
> Regards
> Bejoy.K.S
>
> On Wed, Sep 7, 2011 at 3:09 PM, Sudharsan Sampath <sudha...@gmail.com>
> wrote:
>>
>> This is true, and it took us by surprise in the recent past. It also had
>> quite some impact on our job cycles, where the size of the input is totally
>> random and could also be zero at times.
>> In one of our cycles, we run a lot of jobs. Say we configure X as the number
>> of reducers for a job which does not have any input.
>> Y -> Number of tasktrackers in the cluster
>> H -> Time interval for the heartbeat response, in seconds
>> With the cdh2 version, the job takes
>> (X / Y) * H seconds to complete without doing any work, since we assign
>> only one reduce task per heartbeat.
>>
>> If there are many such jobs in the cycle, the total time that
>> the cluster spends doing nothing accumulates.
>> I was thinking of raising this as a JIRA but am not sure. Should we raise and
>> fix this as a JIRA request? Could the number of reducers set by the client be
>> overridden if the number of mappers is 0?
>> We have a hack for this, verifying the existence of the input path to the
>> Map phase ourselves, but I thought it would be more intuitive for the
>> framework to handle this itself.
>> -Sudhan S
>> On Wed, Sep 7, 2011 at 2:25 PM, Harsh J <ha...@cloudera.com> wrote:
>>>
>>> Oh boy are you in for a surprise. Reducers _can_ run with 0 mappers in a
>>> job ;-)
>>>
>>> /me puts his troll-mask on.
>>>
>>> ➜  ~HADOOP_HOME  hadoop fs -mkdir abc
>>> ➜  ~HADOOP_HOME  hadoop jar hadoop-examples-0.20.2-cdh3u1.jar wordcount abc out
>>> 11/09/07 14:24:14 INFO input.FileInputFormat: Total input paths to process : 0
>>> 11/09/07 14:24:14 INFO mapred.JobClient: Running job: job_201109071413_0001
>>> 11/09/07 14:24:15 INFO mapred.JobClient:  map 0% reduce 0%
>>> 11/09/07 14:24:21 INFO mapred.JobClient:  map 0% reduce 100%
>>> 11/09/07 14:24:22 INFO mapred.JobClient: Job complete: job_201109071413_0001
>>> 11/09/07 14:24:22 INFO mapred.JobClient: Counters: 13
>>> 11/09/07 14:24:22 INFO mapred.JobClient:   Job Counters
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Launched reduce tasks=1
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2209
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3113
>>> 11/09/07 14:24:22 INFO mapred.JobClient:   FileSystemCounters
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=59220
>>> 11/09/07 14:24:22 INFO mapred.JobClient:   Map-Reduce Framework
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input groups=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Combine output records=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce shuffle bytes=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce output records=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Spilled Records=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Combine input records=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input records=0
>>>
>>> /me takes off troll mask.
>>>
>>> On Wed, Sep 7, 2011 at 1:30 PM, Bejoy KS <bejoy.had...@gmail.com> wrote:
>>> > Thanks Sonal. I was just thinking of some weird design and wanted to
>>> > make sure whether there is a possibility like that: no maps and all
>>> > reducers.
>>> >
>>> > On Wed, Sep 7, 2011 at 1:22 PM, Sonal Goyal <sonalgoy...@gmail.com>
>>> > wrote:
>>> >>
>>> >> I don't think that is possible. Can you explain in what scenario you
>>> >> want to have no mappers, only reducers?
>>> >> Best Regards,
>>> >> Sonal
>>> >> Crux: Reporting for HBase
>>> >> Nube Technologies
>>> >>
>>> >> On Wed, Sep 7, 2011 at 1:18 PM, Bejoy KS <bejoy.had...@gmail.com>
>>> >> wrote:
>>> >>>
>>> >>> Hi
>>> >>> I have a query. Is it possible to have no mappers but reducers
>>> >>> alone? AFAIK, if we need to avoid the triggering of reducers we can
>>> >>> set numReduceTasks to zero, but such a setting on the mapper won't
>>> >>> work. So how can it be achieved, if possible?
>>> >>>
>>> >>> Thank You
>>> >>>
>>> >>> Regards
>>> >>> Bejoy.K.S
>>> >>
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>
>
>



-- 
Harsh J
