Jeremy,

A clarification: there is currently no mechanism in Hadoop to slot
particular tasks on particular nodes. Hadoop does not take into account a
particular node's suitability for a given task; if one node has more CPU,
and another node has more IO, you cannot indicate that certain tasks should
be done on the CPU-intense nodes, and others on the IO-intense nodes.

Speculative execution, though, means that any tasks which are "left behind"
near the end of a job will be re-executed in parallel on other idle nodes
which are waiting for the full job to complete. Whichever attempt finishes
first is used, and the remaining attempts are killed. Hopefully, this
secondary placement will land the task on a better-suited node if the first
assignment of tasks did not. By default, I believe speculation is enabled
for both map and reduce tasks, but check the defaults shipped with your
version.
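
For reference, a minimal sketch of the settings involved, assuming the
property names used in the 0.18/0.19-era hadoop-default.xml (please verify
them against your own version before relying on this). They can go in
hadoop-site.xml or in an individual job's configuration:

    <!-- Allow duplicate attempts of slow map tasks -->
    <property>
      <name>mapred.map.tasks.speculative.execution</name>
      <value>true</value>
    </property>

    <!-- Allow duplicate attempts of slow reduce tasks -->
    <property>
      <name>mapred.reduce.tasks.speculative.execution</name>
      <value>true</value>
    </property>

Setting either value to false turns speculation off for that task type,
which can be worth doing if your tasks have side effects that are not safe
to run twice.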

- Aaron

On Wed, Dec 24, 2008 at 1:12 AM, Devaraj Das <[email protected]> wrote:

> You can enable speculative execution for your jobs.
>
>
> On 12/24/08 10:25 AM, "Jeremy Chow" <[email protected]> wrote:
>
> > Hi list,
> > I've come up against the following scenario: to finish the same task, one
> > node in my Hadoop cluster needs only 5 seconds, while another needs more
> > than 2 minutes.
> > This is a common phenomenon that decreases the parallelism of our system,
> > because the faster node has to wait for the slower one. How can we
> > coordinate nodes of different computing power in the same cluster?
> >
> > Thanks,
> > Jeremy
>
>
>
