Sorry for the previous incomplete message.
Here is take 2:

When I use a replicated join, only 2 map tasks get scheduled (compared to
100+ tasks for the other steps).
What is the reasoning behind this, and which setting do I use to override
this behaviour?


Also, a basic question:
does Hadoop decide the map task capacity itself, or does it simply follow
the configuration?

Map Task Capacity   Reduce Task Capacity   Avg. Tasks/Node   Blacklisted Nodes   Excluded Nodes
64                  20                     1.00

Thanks, Prashant.
