On Aug 18, 2011, at 12:28 AM, Owen O'Malley wrote:
> 
> This vote is still running with no votes other than mine. 
> 
> I've tested with and without security on a 60 node cluster and I'm seeing 
> some failures, but not that many. On a terasort with 15,000 maps and 200 
> reduces, I ran the following cases:
> 
> security + linux task controller : 2 failures (both mr-2651)
> 
> no security + default task controller : 6-7 failures (seems to be a race 
> condition in clean up)
> 
> Even in the no security case, it is only losing 0.05% of the time.

        We're seeing much, much higher failure rates, in the 5-10% range.  It 
might very well be because we have more cores/faster boxes.

> It isn't perfect, but this is the code that Yahoo is currently running. I 
> think we should release it.

        Y! can afford the task failures.  The rest of us can't.  So -1.
