Dimitry Andric created TS-2481:
----------------------------------
Summary: Incorrect origin server port used sometimes (with
keep-alive)
Key: TS-2481
URL: https://issues.apache.org/jira/browse/TS-2481
Project: Traffic Server
Issue Type: Bug
Components: HTTP
Reporter: Dimitry Andric
After the fix for TS-1962, there are some scenarios where Traffic Server can
decide to reuse an origin server connection (because of keep-alive), even when
the destination port is different. I encountered this with 4.0.2, though this
bug is still in trunk, but it is masked by another fix.
When using Traffic Server 4.0.2 as a transparent proxy, two requests that are
made to the same origin server but on different ports, may be sent to the same
port.
All GETs work as expected:
* GET to origin:80 arrives at origin:80
* GET to origin:80, followed by GET to origin:9999 arrive at origin:80 and
origin:9999 respectively
* GET to origin:9999, followed by GET to origin:80 arrive at origin:9999 and
origin:80 respectively
But stuff gets weird after a POST:
- POST to origin:9999, followed by GET to origin:80, arrive at origin:9999 and
origin:9999 respectively
This results in requests being sent to incorrect origin server ports, which may
result in the HTML app getting unexpected errors, or unexpected data. A
workaround is to disable keep-alive, but that is of course bad for performance.
It looks like the problem is in proxy/http/HttpSM.cc, in function
HttpSM::do_http_server_open():
{noformat}
4438 void
4439 HttpSM::do_http_server_open(bool raw)
4440 {
...
4510 if (raw == false && t_state.txn_conf->share_server_sessions &&
4511 (t_state.txn_conf->keep_alive_post_out == 1 ||
t_state.hdr_info.request_content_length == 0) &&
4512 !is_private() && ua_session != NULL) {
...
4540 else if ((!t_state.txn_conf->share_server_sessions || is_private())
&& (ua_session != NULL)) {
4541 HttpServerSession *existing_ss = ua_session->get_server_session();
4542
4543 if (existing_ss) {
4544 // [amc] Is this OK? Should we compare ports? (not done by
ats_ip_addr_cmp)
4545 if (ats_ip_addr_eq(&existing_ss->server_ip.sa,
&t_state.current.server->addr.sa)) {
4546 ua_session->attach_server_session(NULL);
4547 existing_ss->state = HSS_ACTIVE;
4548 this->attach_server_session(existing_ss);
4549 hsm_release_assert(server_session != NULL);
4550 handle_http_server_open();
4551 return;
4552 } else {
4553 // As this is in the non-sharing configuration, we want to close
4554 // the existing connection and call connect_re to get a new one
4555 existing_ss->release();
4556 ua_session->attach_server_session(NULL);
4557 }
4558 }
4559 }
{noformat}
When the condition in line 4540 is hit, and get_server_session() returns an
existing session, the addresses of the existing session and the current one are
compared (line 4545), but *not* the port numbers, as the comment in line 4544
indicates. This is incorrect, as an existing session to a different port on
the same origin server should never be reused.
The most obvious fix is to add a comparison of the ports, e.g.:
{noformat}
if (ats_ip_addr_eq(&existing_ss->server_ip.sa,
&t_state.current.server->addr.sa) &&
ats_ip_port_cast(&existing_ss->server_ip.sa) ==
ats_ip_port_cast(&t_state.current.server->addr)) {
{noformat}
I have added this locally as a patch, and with that, the scenario described at
the top of this bug does not occur anymore.
Note that the issue in this particular scenario seems to be masked by the fix
for TS-312. Initially, I tried reproducing with trunk, but I found it only
occurs before that fix. After TS-312, the condition in line 4540 above is not
hit anymore, but the first one in line 4510 is hit instead. However, there may
still be ways to make the execution flow branch to line 4545, and that could
still trigger the bug.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)