shinrich edited a comment on issue #7290:
URL: https://github.com/apache/trafficserver/issues/7290#issuecomment-819885851


   Did some more work today on down servers in our environment.  
   
   One thing I hadn't noticed before was that an origin failure only 
contributes to the down server count if 
t_state->current.server->connect_result is non-zero, that is, if a real error 
happened during the TCP/TLS connection attempt.  Many messages are 
generated in error.log where connect_result is 0 and the failure happened 
between connect open and the first byte from the server.  These transactions 
are available for retries, but they don't count toward marking a 
server down.
   
   Locally, we are trying a build that only adds a log to error.log if it 
really is a connect failure.  That really cut down the noise in our logs.
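
   The rule described above can be sketched roughly as follows.  This is a 
minimal illustration, not the actual ATS code: ServerAttempt, OriginStats, 
record_origin_failure and should_mark_down are hypothetical names, and only 
connect_result corresponds to a field mentioned in this comment.

```cpp
// Hypothetical stand-ins for illustration; these are NOT the real ATS types.
struct ServerAttempt {
  // 0: no connect-level error was recorded (the failure happened after connect open).
  // non-zero: an errno-style code from the TCP/TLS connection attempt.
  int connect_result = 0;
};

struct OriginStats {
  int fail_count      = 0; // failures that count toward marking the server down
  int retry_eligible  = 0; // failures that may still be retried
  int error_log_lines = 0; // lines written to error.log (local build behavior)
};

// Record one failed origin attempt.
// Every failure stays eligible for retry, but only a real connect failure
// (connect_result != 0) bumps the down-server count and is logged.
inline void record_origin_failure(const ServerAttempt &attempt, OriginStats &stats)
{
  ++stats.retry_eligible;
  if (attempt.connect_result != 0) {
    ++stats.fail_count;
    ++stats.error_log_lines;
  }
}

// Assumed policy for the sketch: mark down once the count reaches a threshold.
inline bool should_mark_down(const OriginStats &stats, int threshold)
{
  return stats.fail_count >= threshold;
}
```

   With this shape, a failure between connect open and first byte leaves 
fail_count untouched, so it can never push the server toward being marked 
down by itself.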
   
   Once we remove the noise, we see the following cases of origin connection 
failure in our environment:
   
   ENET_SSL_CONNECT_FAILED - I added this in the case of ERROR_SSL_ERROR in the 
TLS handshake negotiation.  It seems for us this is mostly due to server cert 
verification failure.
   
   Connection timed out [110] - A timeout during the handshake
   
   No route to host [113] - The DNS entry for the origin is still there, but 
the machine has been decommissioned.
   
   Connection refused [111] - Presumably the service is down
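
   As a rough sketch of how these cases could be told apart from a 
connect_result value: the bracketed numbers above match the standard errno 
constants ETIMEDOUT, EHOSTUNREACH and ECONNREFUSED on common Linux 
architectures.  describe_connect_result is a hypothetical helper, and 
kSslConnectFailed is a placeholder because the real value of 
ENET_SSL_CONNECT_FAILED is not given here.

```cpp
#include <cerrno>
#include <string>

// Placeholder only: the actual ENET_SSL_CONNECT_FAILED value lives in the
// ATS source and is not stated in this comment.
constexpr int kSslConnectFailed = -1;

// Hypothetical helper: map a connect_result code to the cases listed above.
inline std::string describe_connect_result(int connect_result)
{
  switch (connect_result) {
  case 0:
    return "no connect error (failure after connect open)";
  case kSslConnectFailed:
    return "TLS handshake failed (mostly server cert verification for us)";
  case ETIMEDOUT: // 110 on common Linux architectures
    return "timeout during the handshake";
  case EHOSTUNREACH: // 113 on common Linux architectures
    return "no route to host (DNS entry present, machine decommissioned)";
  case ECONNREFUSED: // 111 on common Linux architectures
    return "connection refused (service presumably down)";
  default:
    return "other connect error";
  }
}
```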
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

