TS does not fail-over if one origin server for a 2 address hostname goes down
-----------------------------------------------------------------------------
Key: TS-750
URL: https://issues.apache.org/jira/browse/TS-750
Project: Traffic Server
Issue Type: Bug
Components: HTTP
Affects Versions: 2.1.4, 2.1.5, 2.1.6, 2.1.7
Environment: Any
Reporter: William Bardwell
If you have a hostname that resolves to 2 addresses, and you make a request to
TS for something at that hostname, and then kill the origin server at whichever
address TS just talked to, your next request (if made promptly) will fail
with a 502 status code. A request made after that will fail over correctly.
Tracing the code, I see it make proxy.config.http.connect_attempts_max_retries
retries to the same address. It does call code to mark the address down
after proxy.config.http.connect_attempts_rr_retries attempts, but the address
never actually gets marked down.
(The code calls HttpTransact::delete_server_rr_entry(), which does
TRANSACT_RETURN(OS_RR_MARK_DOWN, ReDNSRoundRobin), which in turn tries to set
up the marking with HTTP_SM_SET_DEFAULT_HANDLER(&HttpSM::state_mark_os_down).
But state_mark_os_down never actually runs; instead the code just goes into the
retry, I think because ReDNSRoundRobin sets s->next_action =
how_to_open_connection(s).)
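To make the expected versus observed behavior concrete, here is a toy model of
the retry loop. It is not Traffic Server code: Origin, simulate_request,
max_attempts, rr_retries and mark_down_takes_effect are hypothetical names
invented for this sketch, and the loop only assumes that a mark-down after
rr_retries failures should move the remaining attempts to the other address.

```cpp
#include <array>

// Toy model only; none of these names exist in Traffic Server.
struct Origin {
  bool alive;        // would a connect attempt to this address succeed?
  bool marked_down;  // has the proxy marked this address down?
};

// Returns the status the client would see: 200 if some connect attempt
// succeeds, 502 if all max_attempts attempts fail.
// mark_down_takes_effect == false models the behavior reported here, where
// the mark-down step is requested but never actually runs.
int simulate_request(std::array<Origin, 2> &addrs, int current,
                     int max_attempts, int rr_retries,
                     bool mark_down_takes_effect)
{
  int failures_on_current = 0;
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    if (addrs[current].alive) {
      return 200;
    }
    ++failures_on_current;
    if (mark_down_takes_effect && failures_on_current >= rr_retries) {
      addrs[current].marked_down = true;
      int other = 1 - current;
      if (!addrs[other].marked_down) {
        current = other;  // remaining attempts go to the other address
        failures_on_current = 0;
      }
    }
  }
  return 502;
}
```

With the first address dead and the second alive, calling this with
mark_down_takes_effect set to false spends every attempt on the dead address
and yields 502, which matches what I see. With it set to true, the attempts
switch to the second address once rr_retries failures have accumulated, and
the request succeeds.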
I have a fix. It doesn't seem like quite the right way to go about things,
but I can't figure out how to get state_mark_os_down called at the right time.