Thanks for the feedback.  While not an issue in my case, a configuration 
parameter that limits the number of backends to try could be useful for others. 
 I don't know how most people use varnish, but potentially triggering vcl_error 
when a single backend shuts down is probably undesirable behavior for most 
users.

From: [email protected] 
[mailto:[email protected]] On Behalf Of Adrian Otto
Sent: Sunday, April 11, 2010 7:51 PM
To: [email protected]
Subject: Re: [PATCH] Random director tries all backends before giving up

Jack,

This approach is probably not a good idea if (a) you have a large cluster, (b) 
your cluster is heavily loaded, and/or (c) your backends are sensitive to 
overload. You are likely to trigger a cascading failure. It might be smarter to 
have a configurable number of backends to try... perhaps 2 or 3. Imagine if you 
have 50 backends. There is no point in trying 50 times to find a healthy 
backend. Chances are that if 25% of your backends are down, trying more is just 
going to exacerbate the problem.

Adrian
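
A minimal sketch of the capped variant suggested above, written in Python for brevity rather than in varnish's own C. The function name, the max_tries parameter, and the connect callback are all hypothetical; nothing here is an existing varnish setting or API.

```python
import random


def pick_backend_capped(healthy, connect, max_tries=3, rng=random):
    """Try at most max_tries distinct backends, chosen at random.

    healthy:   list of backend names currently marked healthy (assumed input)
    connect:   callable(name) -> bool, True if a connection was obtained
    max_tries: hypothetical cap on how many backends one request may touch
    """
    # Sample without replacement so no backend is hit twice by one request.
    candidates = rng.sample(healthy, min(max_tries, len(healthy)))
    for name in candidates:
        if connect(name):
            return name
    # Give up after the cap; the caller falls through to its error path.
    return None
```

With 50 backends and max_tries=3, a request that cannot connect costs at most 3 connection attempts instead of 50, which bounds the extra load placed on a struggling cluster.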

On Apr 11, 2010, at 4:35 PM, Jack Lindamood wrote:


The following is a patch I've made to varnish that I hope improves the random 
director, and which anyone is welcome to use (even varnish trunk?).  My 
motivation was to reduce the number of vcl_error calls when most of a 
director's backends are healthy.  You can get the entire patch at this link.

http://github.com/cep21/Varnish/commit/6f5e98143ac2636504d9febf574b14c3c1a072fc

Here's the commit message:

Random director tries all backends before giving up

Summary:
The current random director gives up when it can't get an FD to its chosen 
backend "retries" times in a row.  Rather than give up and return NULL, which 
is guaranteed to cause a vcl_error, as a last-ditch effort we try all other 
healthy backends until we get one that works.  This is mostly useful in the 
window after a backend server dies and before enough health checks have failed 
to mark the backend unhealthy.
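
The behavior described in this summary can be sketched roughly as follows. This is an illustration in Python, not the patch itself: the function name, the (name, healthy) pair representation, and the connect callback are assumptions, and backend weights are ignored for simplicity.

```python
import random


def pick_backend(backends, connect, retries=3, rng=random):
    """Return a backend name that yields a connection, or None.

    backends: list of (name, healthy) pairs (hypothetical representation)
    connect:  callable(name) -> bool, True if an FD was obtained
    retries:  number of random picks before the fallback sweep
    """
    healthy = [name for name, ok in backends if ok]
    if not healthy:
        return None
    tried = set()
    # Phase 1: the pre-patch behavior, a few random picks among healthy backends.
    for _ in range(retries):
        name = rng.choice(healthy)
        tried.add(name)
        if connect(name):
            return name
    # Phase 2: the last-ditch sweep over every healthy backend not yet tried.
    for name in healthy:
        if name not in tried and connect(name):
            return name
    # Only now does the caller end up in its error path.
    return None
```

Backends already marked unhealthy are never contacted; the sweep only helps while a dead backend is still considered healthy.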

Backwards Compatibility = Not strictly backwards compatible.  In cases where 
the old code would have fallen through to vcl_error, this will give a shot at 
getting a good result.

Performance = In the worst case, this will add extra calls for getting an FD, 
but only in situations that would otherwise end in vcl_error.

Test Plan: New varnish unittest.  It fails in the old code and works in this 
new code.


_______________________________________________
varnish-dev mailing list
[email protected]
http://lists.varnish-cache.org/mailman/listinfo/varnish-dev
