Hello!

On Tue, May 21, 2013 at 05:51:40PM +0400, Dmitry Popov wrote:

> On Tue, 21 May 2013 17:23:08 +0400
> Maxim Dounin <[email protected]> wrote:
>
> > This is expected behaviour.  Documentation is a bit simplified
> > here, and fail_timeout is used like session time limit - the
> > peer->fails counter is reset once there are no failures within
> > fail_timeout.
> >
> > While this might be non-ideal for some use cases, it's certainly
> > not a bug.
>
> Well, it really hurts. Upstreams which fail in ~1% of requests is not a rare
> case, and we can't use max_fails+fail_timeout for them (because round-robin is
> thrashed for them and ip_hash is completely useless). Moreover, it is very hard
> to debug because of wiki.

Well, in normal world if an upstream constantly fails ~1% of
requests - it's not healthy and should not be used.  I understand
that your use case is a bit special though.

> > Such algorithm forget everything about previous failures once per
> > fail_timeout, and won't detect bursts of failures split across
> > two fail_timeout intervals.
> >
> > Consider:
> >
> > - max_fails = 3, fail_timeout = 10s
> > - failure at 0s
> > - failure at 9s
> > - at 10s peer->fails counter is reset
> > - failure at 11s
> > - failure at 12s
> >
> > While 3 failures are only 3 seconds away from each other, this
> > is not detected due to granularity introduced by the algorithm.
>
> Yes, I know this case, sorry, forgot to mention. However, I think it will
> extend detection period to 2-3 fail_timeouts in real life (in theory up to
> max_fails fail_timeouts, yes, but it's almost improbable). If we want correct
> implementation we need per-second array (with fail_timeout elements), that's an
> overkill in my opinion.

Sure, per-second array isn't a solution.

> By the way, leaky bucket approach (like limit_req but
> with fails per second) might work well here, what do you think?

Yes, leaky/token bucket should work.
That's actually what I have in mind when I think about changing the
above algorithm to something strictly bound to the fail_timeout period.

--
Maxim Dounin
http://nginx.org/en/donation.html
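
For illustration only, the "Consider:" trace quoted above (max_fails = 3, fail_timeout = 10s, failures at 0s, 9s, 11s and 12s) can be replayed in a short Python sketch. This is not nginx code. All function names are hypothetical, and the leak rate max_fails / fail_timeout used for the bucket is an assumption about how such a scheme might be parameterised, not something settled in this thread.

```python
from collections import deque


def fixed_window_trips(failures, max_fails, fail_timeout):
    """Counter that forgets everything at each multiple of fail_timeout."""
    fails, window = 0, 0
    for t in sorted(failures):
        current = int(t // fail_timeout)
        if current != window:
            # Crossed a reset boundary: previous failures are forgotten.
            fails, window = 0, current
        fails += 1
        if fails >= max_fails:
            return True
    return False


def sliding_window_trips(failures, max_fails, fail_timeout):
    """Exact check: max_fails failures inside any span shorter than fail_timeout."""
    recent = deque()
    for t in sorted(failures):
        recent.append(t)
        # Drop failures that are fail_timeout or more seconds old.
        while t - recent[0] >= fail_timeout:
            recent.popleft()
        if len(recent) >= max_fails:
            return True
    return False


def leaky_bucket_peak(failures, max_fails, fail_timeout):
    """Highest bucket level reached; each failure adds 1, the bucket
    drains at max_fails / fail_timeout per second (assumed rate)."""
    rate = max_fails / fail_timeout
    level, last, peak = 0.0, None, 0.0
    for t in sorted(failures):
        if last is not None:
            level = max(0.0, level - rate * (t - last))
        level += 1.0
        last = t
        peak = max(peak, level)
    return peak


if __name__ == "__main__":
    trace = [0, 9, 11, 12]
    print(fixed_window_trips(trace, 3, 10))    # False: the reset at 10s hides the burst
    print(sliding_window_trips(trace, 3, 10))  # True: 9s, 11s, 12s fall within 10s
    print(leaky_bucket_peak(trace, 3, 10))
```

The sliding-window version needs to remember individual failure times, which is comparable in cost to the per-second array dismissed above. The bucket keeps only a level and a timestamp per peer and has no reset boundary, but it is an approximation: at which level a peer should be considered failed is a separate tuning choice that this sketch leaves open.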
