[ 
https://issues.apache.org/jira/browse/TS-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865405#comment-13865405
 ] 

Dimitry Andric edited comment on TS-2481 at 1/8/14 1:18 PM:
------------------------------------------------------------

Suggested fix for TS-2481.


was (Author: dim):
Suggested fix for TS-1962.

> Incorrect origin server port used sometimes (with keep-alive)
> -------------------------------------------------------------
>
>                 Key: TS-2481
>                 URL: https://issues.apache.org/jira/browse/TS-2481
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: HTTP
>            Reporter: Dimitry Andric
>         Attachments: fix-ts-2481-1.diff
>
>
> After the fix for TS-1962, there are some scenarios where Traffic Server can 
> decide to reuse an origin server connection (because of keep-alive), even 
> when the destination port is different.  I encountered this with 4.0.2, 
> though this bug is still in trunk, but it is masked by another fix.
> When using Traffic Server 4.0.2 as a transparent proxy, two requests that are 
> made to the same origin server but on different ports, may be sent to the 
> same port.
> All GETs work as expected:
> * GET to origin:80 arrives at origin:80
> * GET to origin:80, followed by GET to origin:9999 arrive at origin:80 and
> origin:9999 respectively
> * GET to origin:9999, followed by GET to origin:80 arrive at origin:9999 and
> origin:80 respectively
> But stuff gets weird after a POST:
> - POST to origin:9999, followed by GET to origin:80, arrive at origin:9999 and
> origin:9999 respectively
> This results in requests being sent to incorrect origin server ports, which 
> may
> result in the HTML app getting unexpected errors, or unexpected data.  A 
> workaround is to disable keep-alive, but that is of course bad for 
> performance.
> It looks like the problem is in proxy/http/HttpSM.cc, in function 
> HttpSM::do_http_server_open():
> {noformat}
>   4438  void
>   4439  HttpSM::do_http_server_open(bool raw)
>   4440  {
> ...
>   4510    if (raw == false && t_state.txn_conf->share_server_sessions &&
>   4511        (t_state.txn_conf->keep_alive_post_out == 1 || 
> t_state.hdr_info.request_content_length == 0) &&
>   4512         !is_private() && ua_session != NULL) {
> ...
>   4540    else if ((!t_state.txn_conf->share_server_sessions || is_private()) 
> && (ua_session != NULL)) {
>   4541      HttpServerSession *existing_ss = ua_session->get_server_session();
>   4542
>   4543      if (existing_ss) {
>   4544        // [amc] Is this OK? Should we compare ports? (not done by 
> ats_ip_addr_cmp)
>   4545        if (ats_ip_addr_eq(&existing_ss->server_ip.sa, 
> &t_state.current.server->addr.sa)) {
>   4546          ua_session->attach_server_session(NULL);
>   4547          existing_ss->state = HSS_ACTIVE;
>   4548          this->attach_server_session(existing_ss);
>   4549          hsm_release_assert(server_session != NULL);
>   4550          handle_http_server_open();
>   4551          return;
>   4552        } else {
>   4553          // As this is in the non-sharing configuration, we want to 
> close
>   4554          // the existing connection and call connect_re to get a new 
> one
>   4555          existing_ss->release();
>   4556          ua_session->attach_server_session(NULL);
>   4557        }
>   4558      }
>   4559    }
> {noformat}
> When the condition in line 4540 is hit, and get_server_session() returns an 
> existing session, the addresses of the existing session and the current one 
> are compared (line 4545), but *not* the port numbers, as the comment in line 
> 4544 indicates.  This is incorrect, as an existing session to a different 
> port on the same origin server should never be reused.
> The most obvious fix is to add a comparison of the ports, e.g.:
> {noformat}
>       if (ats_ip_addr_eq(&existing_ss->server_ip.sa, 
> &t_state.current.server->addr.sa) &&
>           ats_ip_port_cast(&existing_ss->server_ip.sa) == 
> ats_ip_port_cast(&t_state.current.server->addr)) {
> {noformat}
> I have added this locally as a patch, and with that, the scenario described 
> at the top of this bug does not occur anymore.
> Note that the issue in this particular scenario seems to be masked by the fix 
> for TS-312.  Initially, I tried reproducing with trunk, but I found it only 
> occurs before that fix.  After TS-312, the condition in line 4540 above is 
> not hit anymore, but the first one in line 4510 is hit instead.  However, 
> there may still be ways to make the execution flow branch to line 4545, and 
> that could still trigger the bug.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to