On Mar 3, 2011, at 2:51 AM, Steve Loughran wrote:

> yes, but the problem is determining which one will fail. Ideally you should 
> find the root cause, which is often some race condition or hardware fault. 
> If it's the same server every time, turn it off.

> You can play with the specex parameters, maybe change when they get kicked 
> off. The assumption in the code is that the slowness is caused by H/W 
> problems (especially HDD issues) and it tries to avoid duplicate work. If 
> every Map was duplicated, you'd be doubling the effective cost of each query, 
> and annoying everyone else in the cluster. Plus increased disk and network IO 
> might slow things down.
> 
> Look at the options, have a play and see. If it doesn't have the feature, you 
> can always try coding it in; if the scheduler API lets you do it, you won't be 
> breaking anyone else's code.
> 
> -steve
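
(For reference, the specex parameters being discussed are the per-job
speculative-execution booleans. A minimal sketch below, assuming the 0.20-era
mapred.* property names; later releases renamed them to
mapreduce.map.speculative / mapreduce.reduce.speculative.)

    import org.apache.hadoop.mapred.JobConf;

    // Toggle speculative execution per job (0.20-era property names).
    JobConf conf = new JobConf();
    // Leave map-side speculation off so the many mappers are not duplicated.
    conf.setBoolean("mapred.map.tasks.speculative.execution", false);
    // Allow a backup attempt of a slow reduce task; the first attempt to
    // finish wins and the framework kills the other attempt.
    conf.setBoolean("mapred.reduce.tasks.speculative.execution", true);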


Thanks.  I'll take it under consideration.  In my case, it would be really 
beneficial to duplicate the work.  The task in question is a single task on a 
single node (numerous mappers feed data into a single reducer), so duplicating 
the reducer costs very little extra work while mitigating a potential 
bottleneck in the job's performance, since the job simply is not done until 
that single reducer finishes.  I would really like to be able to do what I am 
suggesting: duplicate the reducer and kill the clones after the winner 
finishes.
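
(A sketch of the setup I'm describing, assuming the 0.20 mapreduce.Job API and
a hypothetical driver class and job name.  With reduce-side speculation on, the
framework launches a second attempt of the slow reducer and kills whichever
attempt loses the race, which is essentially the duplicate-and-kill behavior I
want.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // Hypothetical driver: many mappers feed a single reducer, and a
    // speculative duplicate of that reducer is allowed.
    public class SingleReducerDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.setBoolean("mapred.map.tasks.speculative.execution", false);
            conf.setBoolean("mapred.reduce.tasks.speculative.execution", true);

            Job job = new Job(conf, "single-reducer-job");  // hypothetical name
            job.setNumReduceTasks(1);  // the lone reducer that gates completion
            // ... set mapper/reducer/input/output classes here, then:
            // job.waitForCompletion(true);
        }
    }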

Anyway, thanks.

________________________________________________________________________________
Keith Wiley     [email protected]     keithwiley.com    music.keithwiley.com

"Luminous beings are we, not this crude matter."
                                           --  Yoda
________________________________________________________________________________
