On 02/03/11 21:01, Keith Wiley wrote:
> I realize that the intended purpose of speculative execution is to overcome individual
> slow tasks...and I have read that it explicitly is *not* intended to start copies of a
> task simultaneously and to then race them, but rather to start copies of tasks that
> "seem slow" after running for a while.
> ...but aside from merely being slow, sometimes tasks arbitrarily fail, and not
> in data-driven or otherwise deterministic ways. A task may fail and then
> succeed on a subsequent attempt...but the total job time is extended by the
> time wasted during the initial failed task attempt.
Yes, but the problem is determining which one will fail. Ideally you
should find the root cause, which is often some race condition or
hardware fault. If it's the same server every time, turn it off.
> It would be super-swell to run copies of a task simultaneously from the starting line and simply kill
> the copies after the winner finishes. While this is "wasteful" in some sense (that is the
> argument offered for not running speculative execution this way to begin with), it would be more
> precise to say that different users may have different priorities under various use-case scenarios.
> The "wasting" of duplicate tasks on extra cores may be an acceptable cost toward the
> higher priority of minimizing job times for a given application.
>
> Is there any notion of this in Hadoop?
You can play with the specex parameters, maybe change when speculative
attempts get kicked off. The assumption in the code is that the slowness
is caused by H/W problems (especially HDD issues) and it tries to avoid
duplicate work. If every Map were duplicated, you'd be doubling the
effective cost of each query and annoying everyone else in the cluster.
Plus, the increased disk and network IO might slow things down.
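For the basic per-job switches, something like this should work against the
old mapred API - untested, the SpecExSettings wrapper class is just for
illustration, and the exact setter and property names can vary between
releases, so check your version's JobConf javadoc and mapred-default.xml:

  import org.apache.hadoop.mapred.JobConf;

  public class SpecExSettings {
    /** Turn speculative attempts on (or off) for a single job. */
    public static JobConf setSpeculation(JobConf conf, boolean enabled) {
      conf.setMapSpeculativeExecution(enabled);
      conf.setReduceSpeculativeExecution(enabled);
      // Equivalent raw properties in the 0.20/1.x line:
      //   mapred.map.tasks.speculative.execution
      //   mapred.reduce.tasks.speculative.execution
      return conf;
    }
  }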
Look at the options, have a play and see. If it doesn't have the
feature, you can always try coding it in - if the scheduler API lets you
do it, you won't be breaking anyone else's code.
-steve