Harsh,

This may be similar to the case I observed, where no tasks were created on
my 2nd node. It was the simple 'grep' example with only 6 input files.

-steve

On Thu, Aug 2, 2012 at 7:13 AM, Harsh J <ha...@cloudera.com> wrote:

> If you are speaking of Reduce tasks, then schedulers do try to spread them
> across (by attempting to schedule only 1 reduce task per heartbeat/round).
>
> If you speak of data distribution equality, that depends on your data
> (remember that reducers process key groups, and keys can carry skews
> depending on input data IRL) and your partitioner
> implementation/partitioning needs.
>
>
> On Thu, Aug 2, 2012 at 4:35 PM, Saurabh Bajaj
> <saurabh.ba...@mu-sigma.com> wrote:
>
>> Okay. One more question.
>>
>> Can we somehow make sure that all the reducers are equally split across
>> all the nodes?
>>
>> Saurabh
>>
>> *From:* Harsh J [mailto:ha...@cloudera.com]
>> *Sent:* Thursday, August 02, 2012 4:05 PM
>> *To:* mapreduce-user@hadoop.apache.org
>> *Subject:* Re: All reducers are not being utilized
>>
>> Saurabh,
>>
>> I do not see you talk about defining a custom Partitioner that can
>> guarantee such perfect key distribution. The default partitioner is the
>> HashPartitioner that can only guarantee randomized distribution (as it is
>> key data specific). Hence, your test here with just 3 keys is not really a
>> good way to test key distribution to reducers with a HashPartitioner.
>> Try it out with a large data set to see for real.
>>
>> On Thu, Aug 2, 2012 at 3:55 PM, Saurabh Bajaj <saurabh.ba...@mu-sigma.com>
>> wrote:
>>
>> Hi everyone,
>>
>> I was running an MR job in Java and this scenario happened:
>>
>> *Case 1:*
>>
>> Number of distinct output keys from mapper = 3
>> Expected # of reducers = 3
>> Defined set # of reducers to be called = 2
>>
>> *Expected outcome:*
>>
>> # of reducers spawned = 2
>> # of keys processed under first reducer = 1
>> # of keys processed under second reducer = 2
>>
>> *Observed outcome:*
>>
>> # of keys processed under first reducer = 3
>> # of keys processed under second reducer = 0
>>
>> *Case 2:*
>>
>> Number of distinct output keys from mapper = 3
>> Expected # of reducers = 3
>> Defined set # of reducers to be called = 3
>>
>> *Expected outcome:*
>>
>> # of reducers spawned = 3
>> # of keys processed under first reducer = 1
>> # of keys processed under second reducer = 1
>> # of keys processed under third reducer = 1
>>
>> *Observed outcome:*
>>
>> # of reducers spawned = 3
>> # of keys processed under first reducer = 2
>> # of keys processed under second reducer = 0
>> # of keys processed under third reducer = 1
>>
>> Any idea why all the reducers are not utilized?
>>
>> Saurabh Bajaj | Senior Business Analyst | +91 9986588089 | www.mu-sigma.com
>>
>> ------------------------------
>>
>> This email message may contain proprietary, private and confidential
>> information. The information transmitted is intended only for the person(s)
>> or entities to which it is addressed.
>> Any review, retransmission,
>> dissemination or other use of, or taking of any action in reliance upon,
>> this information by persons or entities other than the intended recipient
>> is prohibited and may be illegal. If you received this in error, please
>> contact the sender and delete the message from your system.
>>
>> Mu Sigma takes all reasonable steps to ensure that its electronic
>> communications are free from viruses. However, given Internet
>> accessibility, the Company cannot accept liability for any virus introduced
>> by this e-mail or any attachment and you are advised to use up-to-date
>> virus checking software.
>>
>> --
>> Harsh J
>
> --
> Harsh J

--
Steve Sonnenberg
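
For anyone reading this thread later, here is a rough sketch of why 3 keys can land on fewer than 3 reducers, and what an explicit assignment would look like instead. It runs without Hadoop. The `hashPartition` method uses the same arithmetic as Hadoop's HashPartitioner (`(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`); everything else is an assumption made for illustration: the class name `PartitionSketch`, the method names, the three sample keys, and the use of `String.hashCode()` as a stand-in for the hash of a real Writable key type such as Text, which hashes differently. The partition numbers it prints will therefore not match a real job.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartitionSketch {

    // Same arithmetic as Hadoop's HashPartitioner.getPartition():
    // clear the sign bit of the hash, then take it modulo the reducer count.
    static int hashPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical alternative: when the full key set is known up front,
    // hand out partitions in round-robin order so no reducer stays empty
    // as long as there are at least as many keys as reducers.
    static Map<String, Integer> roundRobinAssignment(List<String> keys, int numReduceTasks) {
        Map<String, Integer> assignment = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            assignment.put(keys.get(i), i % numReduceTasks);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Hypothetical keys; the thread does not say what the real keys were.
        List<String> keys = Arrays.asList("alpha", "beta", "gamma");

        for (int reducers = 2; reducers <= 3; reducers++) {
            Map<String, Integer> explicit = roundRobinAssignment(keys, reducers);
            System.out.println("numReduceTasks = " + reducers);
            for (String key : keys) {
                System.out.println("  " + key
                        + ": hash partition = " + hashPartition(key, reducers)
                        + ", explicit partition = " + explicit.get(key));
            }
        }
    }
}
```

With only a handful of keys, nothing in the hash arithmetic prevents two of them from sharing a remainder, which is consistent with the empty reducers in Case 1 and Case 2 above. In an actual job, a fixed mapping like the one in `roundRobinAssignment` would have to live inside a custom Partitioner set on the job, as Harsh suggests; that only helps when the key set is small and known in advance.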