Thank you all. I noticed this strange behavior some time back as well. My initial concern still remains, though. If I point the job at an empty input directory then, yes, the map tasks won't be executed. But my reducer needs input to do the processing/aggregation. In such a scenario, is there an option to provide input just to the reducer?
Regards
Bejoy.K.S

On Wed, Sep 7, 2011 at 3:09 PM, Sudharsan Sampath <sudha...@gmail.com> wrote:

> This is true, and it took us by surprise in the recent past. It also had
> quite some impact on our job cycles, where the size of the input is totally
> random and can also be zero at times.
>
> In one of our cycles, we run a lot of jobs. Say we configure X as the number
> of reducers for a job which does not have any input.
>
> Y -> Number of tasktrackers in the cluster
>
> H -> Time interval for the heartbeat response
>
> With the cdh2 version, the job takes
>
> (X / Y) * H seconds to complete without doing any work, since we assign
> only one reduce task per heartbeat.
>
> If there are many such jobs in the cycle, the total time that the cluster
> spends doing nothing accumulates.
>
> I was thinking of raising this as a JIRA but am not sure. Should we raise and
> fix this as a JIRA request? Could the number of reducers set by the client be
> overridden if the number of mappers is 0?
>
> We have a hack, verifying the existence of the input path to the Map phase
> ourselves, but I thought it would be more intuitive for the framework to
> handle this itself.
>
> -Sudhan S
>
> On Wed, Sep 7, 2011 at 2:25 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Oh boy are you in for a surprise. Reducers _can_ run with 0 mappers in a
>> job ;-)
>>
>> /me puts his troll-mask on.
>>
>> ➜ ~HADOOP_HOME hadoop fs -mkdir abc
>> ➜ ~HADOOP_HOME hadoop jar hadoop-examples-0.20.2-cdh3u1.jar wordcount abc out
>> 11/09/07 14:24:14 INFO input.FileInputFormat: Total input paths to process : 0
>> 11/09/07 14:24:14 INFO mapred.JobClient: Running job: job_201109071413_0001
>> 11/09/07 14:24:15 INFO mapred.JobClient: map 0% reduce 0%
>> 11/09/07 14:24:21 INFO mapred.JobClient: map 0% reduce 100%
>> 11/09/07 14:24:22 INFO mapred.JobClient: Job complete: job_201109071413_0001
>> 11/09/07 14:24:22 INFO mapred.JobClient: Counters: 13
>> 11/09/07 14:24:22 INFO mapred.JobClient: Job Counters
>> 11/09/07 14:24:22 INFO mapred.JobClient: Launched reduce tasks=1
>> 11/09/07 14:24:22 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=2209
>> 11/09/07 14:24:22 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
>> 11/09/07 14:24:22 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
>> 11/09/07 14:24:22 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=3113
>> 11/09/07 14:24:22 INFO mapred.JobClient: FileSystemCounters
>> 11/09/07 14:24:22 INFO mapred.JobClient: FILE_BYTES_WRITTEN=59220
>> 11/09/07 14:24:22 INFO mapred.JobClient: Map-Reduce Framework
>> 11/09/07 14:24:22 INFO mapred.JobClient: Reduce input groups=0
>> 11/09/07 14:24:22 INFO mapred.JobClient: Combine output records=0
>> 11/09/07 14:24:22 INFO mapred.JobClient: Reduce shuffle bytes=0
>> 11/09/07 14:24:22 INFO mapred.JobClient: Reduce output records=0
>> 11/09/07 14:24:22 INFO mapred.JobClient: Spilled Records=0
>> 11/09/07 14:24:22 INFO mapred.JobClient: Combine input records=0
>> 11/09/07 14:24:22 INFO mapred.JobClient: Reduce input records=0
>>
>> /me takes off troll mask.
>>
>> On Wed, Sep 7, 2011 at 1:30 PM, Bejoy KS <bejoy.had...@gmail.com> wrote:
>> > Thanks Sonal. I was just thinking of some weird design and wanted to
>> > make sure whether there is a possibility like that: no maps and all reducers.
>> >
>> > On Wed, Sep 7, 2011 at 1:22 PM, Sonal Goyal <sonalgoy...@gmail.com> wrote:
>> >>
>> >> I don't think that is possible. Can you explain in what scenario you want
>> >> to have no mappers, only reducers?
>> >> Best Regards,
>> >> Sonal
>> >> Crux: Reporting for HBase
>> >> Nube Technologies
>> >>
>> >> On Wed, Sep 7, 2011 at 1:18 PM, Bejoy KS <bejoy.had...@gmail.com> wrote:
>> >>>
>> >>> Hi
>> >>> I have a query here. Is it possible to have no mappers but
>> >>> reducers alone? AFAIK, if we need to avoid triggering the reducers we can
>> >>> set numReduceTasks to zero, but no such setting works for the mappers. So how
>> >>> can it be achieved, if possible?
>> >>>
>> >>> Thank You
>> >>>
>> >>> Regards
>> >>> Bejoy.K.S
>> >
>>
>> --
>> Harsh J
>
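
For reference, the idle-time estimate from Sudharsan's message, (X / Y) * H seconds per empty-input job, can be worked through with a small sketch. This is only the arithmetic from the thread written out in Python. The function names and the sample numbers (reducer counts, 10 tasktrackers, a 3-second heartbeat) are hypothetical and are not taken from any real cluster or from the Hadoop API.

```python
def idle_seconds(num_reducers, num_tasktrackers, heartbeat_seconds):
    """Estimate from the thread: (X / Y) * H seconds for one empty-input job.

    X = num_reducers, Y = num_tasktrackers, H = heartbeat_seconds.
    Assumes, as the thread states, one reduce task assigned per heartbeat.
    """
    if num_tasktrackers <= 0:
        raise ValueError("need at least one tasktracker")
    return (num_reducers / num_tasktrackers) * heartbeat_seconds


def idle_seconds_for_cycle(reducers_per_job, num_tasktrackers, heartbeat_seconds):
    """Sum the per-job estimate over every empty-input job in a cycle."""
    return sum(
        idle_seconds(x, num_tasktrackers, heartbeat_seconds)
        for x in reducers_per_job
    )


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the formula.
    print(idle_seconds(100, 10, 3.0))
    print(idle_seconds_for_cycle([100, 50, 20], 10, 3.0))
```

With these made-up values, a single job with 100 reducers would idle for about 30 seconds, and three such empty jobs in a cycle for about 51 seconds, which is how the wasted time accumulates across a cycle.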