Re: [PATCH] BUG/MEDIUM: Do not consider an agent check as failed on L7 error

2015-02-25 Thread Simon Horman
On Thu, Feb 26, 2015 at 07:09:25AM +0100, Willy Tarreau wrote:
 Hi Simon,
 
 On Thu, Feb 26, 2015 at 11:26:17AM +0900, Simon Horman wrote:
  As failure to connect to the agent check is not sufficient to mark it as
  failed it stands to reason that an L7 error shouldn't either.
  
  Without this fix if an L7 error occurs, for example if connectivity to the
  agent is lost immediately after establishing a connection to it, then the
  agent check will be considered to have failed and thus may end up with zero
  health. Once this has occurred if the primary health check also reaches
  zero health, which is likely if connectivity to the server is lost, then
  the server will be marked as down and not be marked as up again until a
  successful agent check occurs regardless of the success of any primary
  health checks.
  
  This behaviour is not correct as a failed agent check should never cause a
  server to be marked as down or by extension continue to be marked as down.
  
  Signed-off-by: Simon Horman ho...@verge.net.au
 
 Makes sense, thanks. Applied to both 1.6 and 1.5.

Thanks. It seems that at the very least this problem was reproducible
by rebooting backend servers.



Re: [PATCH] BUG/MEDIUM: Do not consider an agent check as failed on L7 error

2015-02-25 Thread Willy Tarreau
Hi Simon,

On Thu, Feb 26, 2015 at 11:26:17AM +0900, Simon Horman wrote:
 As failure to connect to the agent check is not sufficient to mark it as
 failed it stands to reason that an L7 error shouldn't either.
 
 Without this fix if an L7 error occurs, for example if connectivity to the
 agent is lost immediately after establishing a connection to it, then the
 agent check will be considered to have failed and thus may end up with zero
 health. Once this has occurred if the primary health check also reaches
 zero health, which is likely if connectivity to the server is lost, then
 the server will be marked as down and not be marked as up again until a
 successful agent check occurs regardless of the success of any primary
 health checks.
 
 This behaviour is not correct as a failed agent check should never cause a
 server to be marked as down or by extension continue to be marked as down.
 
 Signed-off-by: Simon Horman ho...@verge.net.au

Makes sense, thanks. Applied to both 1.6 and 1.5.

Best regards,
Willy
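
For readers following the thread, the counting behaviour described in the
commit message can be modelled roughly as below. This is only a toy sketch
and not HAProxy's code: the names Result, agent_health_after,
ignore_l7_error and max_health are all made up for illustration, and the
counter is simplified to "one step up on success, one step down on failure".

```python
from enum import Enum, auto


class Result(Enum):
    """Hypothetical outcomes of one agent check run (illustrative only)."""
    PASSED = auto()
    CONNECT_FAILED = auto()  # the agent could not be reached at all
    L7_ERROR = auto()        # connected, then connectivity was lost


def agent_health_after(health, result, ignore_l7_error, max_health=3):
    """Return the agent check's health counter after one result.

    Assumption: health moves one step per check and is clamped to
    the range [0, max_health].
    """
    if result is Result.PASSED:
        return min(health + 1, max_health)
    if result is Result.CONNECT_FAILED:
        # Per the commit message, a failed connect alone is not a failure.
        return health
    if result is Result.L7_ERROR and ignore_l7_error:
        # Behaviour argued for in the patch: treat it like a failed connect.
        return health
    # Behaviour described as the bug: the L7 error counts as a failure.
    return max(health - 1, 0)


def run(results, ignore_l7_error, start=3):
    """Feed a sequence of results through the counter and return the end value."""
    health = start
    for result in results:
        health = agent_health_after(health, result, ignore_l7_error)
    return health


if __name__ == "__main__":
    outage = [Result.L7_ERROR] * 3
    print("L7 errors counted as failures:", run(outage, ignore_l7_error=False))
    print("L7 errors ignored:            ", run(outage, ignore_l7_error=True))
```

With L7 errors counted, three of them in a row take the agent check from
full health to zero, which is the state the commit message says keeps a
server down until the next successful agent check. With them ignored, the
counter does not move.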