On Mar 3, 2011, at 2:51 AM, Steve Loughran wrote:

> yes, but the problem is determining which one will fail. Ideally you should
> find the root cause, which is often some race condition or hardware fault.
> If it's the same server every time, turn it off.
> You can play with the specex parameters, maybe change when they get kicked
> off. The assumption in the code is that the slowness is caused by H/W
> problems (especially HDD issues) and it tries to avoid duplicate work. If
> every Map was duplicated, you'd be doubling the effective cost of each query,
> and annoying everyone else in the cluster. Plus increased disk and network IO
> might slow things down.
>
> Look at the options, have a play and see. If it doesn't have the feature, you
> can always try coding it in; if the scheduler API lets you do it, you won't be
> breaking anyone else's code.
>
> -steve

Thanks. I'll take it under consideration. In my case, it would be really beneficial to duplicate the work. The task in question is a single task on a single node (numerous mappers feed data into a single reducer), so duplicating the reducer represents very little duplicated effort while mitigating a potential bottleneck in the job's performance, since the job simply is not done until that single reducer finishes. I would really like to be able to do what I am suggesting: duplicate the reducer and kill the clones after the winner finishes.

Anyway, thanks.

________________________________________________________________________________
Keith Wiley     [email protected]     keithwiley.com     music.keithwiley.com

"Luminous beings are we, not this crude matter."
                                           --  Yoda
________________________________________________________________________________
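P.S. For anyone else searching the archives for the "specex parameters" Steve mentions: with the Hadoop 0.20-era property names (later releases renamed them to mapreduce.map.speculative / mapreduce.reduce.speculative, so check your version's mapred-default.xml), speculative execution can be toggled per task type in mapred-site.xml or a job's configuration, e.g.:

```xml
<!-- Enable speculative (duplicate) attempts for reducers only,
     leaving maps un-duplicated. 0.20.x property names; verify
     against the mapred-default.xml shipped with your release. -->
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
```

Note this only permits a speculative second attempt when the scheduler judges the reducer slow relative to its peers; it does not unconditionally clone the lone reducer the way I'd like.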
