[jira] [Commented] (TS-3085) Large POSTs over (relatively) slower connections failing in ats5

Sudheer Vinukonda (JIRA) Thu, 18 Sep 2014 10:38:16 -0700

    [ https://issues.apache.org/jira/browse/TS-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139224#comment-14139224 ]


Sudheer Vinukonda commented on TS-3085:
---------------------------------------

When a POST fails, the debug log looks like this (slightly enhanced, and 
traced using single/ip debugging in production):

{code}
[Sep 18 17:09:26.382] Server {0x2ab554605700} DEBUG: (ssl) [SSL_NetVConnection::ssl_read_from_net] b->write_avail()=32768
[Sep 18 17:09:26.382] Server {0x2ab554605700} DEBUG: (ssl) [SSL_NetVConnection::ssl_read_from_net] rres=-1
[Sep 18 17:09:26.382] Server {0x2ab554605700} DEBUG: (ssl.error) [SSL_NetVConnection::ssl_read_from_net] error 1
[Sep 18 17:09:26.382] Server {0x2ab554605700} DEBUG: (http_tunnel) [510166] producer_handler [user agent post VC_EVENT_ERROR]
[Sep 18 17:09:26.382] Server {0x2ab554605700} DEBUG: (http_redirect) [HttpTunnel::producer_handler] enable_redirection: [1 0 0] event: 3
[Sep 18 17:09:26.382] Server {0x2ab554605700} DEBUG: (http) [510166] [&HttpSM::tunnel_handler_post_ua, VC_EVENT_ERROR]
{code}

> Large POSTs over (relatively) slower connections failing in ats5
> ----------------------------------------------------------------
>
>                 Key: TS-3085
>                 URL: https://issues.apache.org/jira/browse/TS-3085
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: SSL
>    Affects Versions: 5.0.1
>            Reporter: Sudheer Vinukonda
>            Assignee: Sudheer Vinukonda
>              Labels: yahoo
>             Fix For: 5.2.0
>
>
> We ran into a production issue where large POSTs (30MB or larger) fail 
> over slower connections after the ats5 rollout (the problem can be 
> easily reproduced using a Charles proxy with throttling enabled). 
> Further debugging isolated the issue to uploads over SSL connections. 
> After a lot of debugging, the cause appears to be the following:
> ATS calls SSL_read() followed by SSL_get_error() to check whether the read 
> hit an error. This is repeated until either the complete data is read or 
> an error occurs. However, the OpenSSL documentation recommends calling 
> ERR_clear_error() before SSL_read() + SSL_get_error() to ensure the error 
> queue is free of any leftover/garbage errors. It's not clear what might be 
> polluting the current thread's OpenSSL error queue in a tight loop; 
> possibly some new feature in ats5. In any case, calling ERR_clear_error() 
> is a good idea, and adding it seems to resolve the POST failures.
> Documentation from OpenSSL and some related notes on Stack Overflow:
> https://www.openssl.org/docs/ssl/SSL_get_error.html
> http://stackoverflow.com/questions/18179128/how-to-manage-the-error-queue-in-openssl-ssl-get-error-and-err-get-error
> {code}
> "SSL_get_error() returns a result code (suitable for the C ``switch''
> statement) for a preceding call to SSL_connect(), SSL_accept(),
> SSL_do_handshake(), SSL_read(), SSL_peek(), or SSL_write() on ssl. The value
> returned by that TLS/SSL I/O function must be passed to SSL_get_error() in
> parameter ret.
> In addition to ssl and ret, SSL_get_error() inspects the current thread's
> OpenSSL error queue. Thus, SSL_get_error() must be used in the same thread
> that performed the TLS/SSL I/O operation, and no other OpenSSL function
> calls should appear in between. The current thread's error queue must be
> empty before the TLS/SSL I/O operation is attempted, or SSL_get_error() will
> not work reliably."
>
> "SSL_get_error does not call ERR_get_error. So if you just call SSL_get_error,
> the error stays in the queue.
> You should be calling ERR_clear_error prior to ANY SSL-call(SSL_read,
> SSL_write etc) that is followed by SSL_get_error, otherwise you may be
> reading an old error that occurred previously in the current thread."
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
