On Thu, Mar 3, 2011 at 2:04 PM, Keith Wiley <[email protected]> wrote:
> On Mar 3, 2011, at 2:51 AM, Steve Loughran wrote:
>
>> Yes, but the problem is determining which one will fail. Ideally you should
>> find the root cause, which is often some race condition or hardware fault.
>> If it's the same server every time, turn it off.
>>
>> You can play with the specex parameters, maybe change when they get kicked
>> off. The assumption in the code is that the slowness is caused by H/W
>> problems (especially HDD issues), and it tries to avoid duplicate work. If
>> every map were duplicated, you'd be doubling the effective cost of each
>> query and annoying everyone else in the cluster. Plus, the increased disk and
>> network I/O might slow things down.
>>
>> Look at the options, have a play, and see. If it doesn't have the feature,
>> you can always try coding it in; if the scheduler API lets you do it, you
>> won't be breaking anyone else's code.
>>
>> -steve
>
> Thanks, I'll take it under consideration. In my case, it would be really
> beneficial to duplicate the work. The task in question is a single task on
> a single node (numerous mappers feed data into a single reducer), so
> duplicating the reducer represents very little duplicated effort while
> mitigating a potential bottleneck in the job's performance, since the job
> simply is not done until the single reducer finishes. I would really like to
> be able to do what I'm suggesting: duplicate the reducer and kill the
> clones after the winner finishes.
>
> Anyway, thanks.
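For reference, the "specex parameters" Steve mentions are per-job configuration properties; as of Hadoop 0.20/1.x they can be set in mapred-site.xml or on the job conf (both default to true; check your version's defaults). A fragment enabling speculation for reduces only might look like:

```xml
<!-- Hadoop 0.20/1.x property names; both default to true. -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>true</value>
</property>
```

Note that speculative execution only launches a duplicate when the framework judges a task to be a straggler relative to its siblings, so it is not a guaranteed "always clone the reducer" switch.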
What is your reason for needing a single reducer? I'd first try to see whether that work could be parallelized.

Jacob
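To illustrate Jacob's point: if the single reducer's work is an associative operation (a sum is used here as a stand-in), it can be split across N reducers that each emit a partial result, followed by a trivial final merge. This is a hypothetical sketch in plain Java, not Hadoop API; `partialSum` and `merge` are illustrative names.

```java
import java.util.Arrays;
import java.util.List;

public class PartialAggregate {

    // What each of the N parallel reducers would compute over its partition.
    public static long partialSum(List<Long> partition) {
        long sum = 0;
        for (long v : partition) sum += v;
        return sum;
    }

    // The cheap final step: combine the N partial results.
    public static long merge(List<Long> partials) {
        long total = 0;
        for (long p : partials) total += p;
        return total;
    }

    public static void main(String[] args) {
        // Simulate two reducers, each handling half the data.
        long p1 = partialSum(Arrays.asList(1L, 2L, 3L));
        long p2 = partialSum(Arrays.asList(4L, 5L));
        System.out.println(merge(Arrays.asList(p1, p2))); // prints 15
    }
}
```

In Hadoop terms this corresponds to setting `conf.setNumReduceTasks(N)` and then merging the N reducer outputs with a tiny follow-up job (or a single cheap pass); it only applies when the reduce operation is associative and commutative.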
