Unfortunately, it takes several hours of running to identify this failure; building a new package and bisecting is going to be quite difficult with that long a turnaround.
I was, however, able to get a tcpdump of one of the affected sessions. This session spans several requests to the same backend (Typhoeus pools curl sockets and tries to use keep-alive whenever possible). The first few requests succeed; at some point there's a ~9-second break in requests while the process does other work, and the server closes the TCP stream (our load balancer is configured to close idle keep-alive sessions after 9 seconds). Within a few milliseconds, another request is sent on the same socket; it immediately gets back a RST because the server has already closed this socket. That appears to cause this exception to be raised by libcurl.

Did anything change around handling keep-alive session expiry in HTTP/1.1 mode? Nothing jumps out at me in the git log (there's something in the schannel backend, but I'm on Linux; there's also d5bb459ccf <https://github.com/curl/curl/commit/d5bb459ccf1fc5980ae4b95c05b4ecf6454a7599>, which claims to only affect CONNECT-only connections, and all of this is regular GETs and POSTs)...

On Fri, Sep 11, 2020 at 10:34 PM Ray Satiro via curl-library <curl-library@cool.haxx.se> wrote:

> On 9/11/2020 2:03 PM, James Brown via curl-library wrote:
>
> After upgrading a test cluster from 7.71.1 to 7.72.0, we're now seeing
> around 0.1% of POSTs from one (and only one) of our applications fail with
> "Failed sending data to the peer" (CURLE_SEND_ERROR) and no other error.
> Based on logs, the request actually succeeds, but libcurl is returning this
> error. This application is using the Ruby Typhoeus wrapper and is itself
> unchanged. The relevant connections are all HTTP/1.1 connections to hosts
> on the local network, and the POSTs are all very small (<1KB) with nothing
> interesting about them.
>
> I haven't had any luck tracking this down since it's such a low fraction
> of requests and is only affecting one of our several hundred applications,
> but it reproducibly happens with 7.72 and not with 7.71.1.
>
> Anyone have any suggestions for how to try to track down the regression? I
> looked at the diff between 7.71.1 and 7.72.0 and no lines containing the
> string "CURLE_SEND_ERROR" were touched, which is unfortunate.
>
>
> There are no similar reports and I looked through the commit history but
> nothing stood out. If you can reliably reproduce then try bisecting it
> https://github.com/curl/curl/wiki/how-to-git-bisect
>
>
> -------------------------------------------------------------------
> Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
> Etiquette: https://curl.haxx.se/mail/etiquette.html

--
James Brown
Engineer
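
The sequence seen in the capture (a keep-alive connection closed by the server after an idle timeout, then reused by the client moments later) can be approximated without libcurl or Typhoeus. The following is only a minimal sketch using plain Python sockets on loopback. The function names and the 0.5-second timeout (standing in for the load balancer's 9 seconds) are assumptions for illustration, and it says nothing about how libcurl itself reacts to the condition.

```python
import socket
import threading
import time

IDLE_TIMEOUT = 0.5  # assumption: short stand-in for the 9-second idle timeout


def serve_one_connection(listener, idle_timeout):
    """Answer requests on one connection; close it after idle_timeout of silence."""
    conn, _ = listener.accept()
    conn.settimeout(idle_timeout)
    try:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    except socket.timeout:
        pass  # connection sat idle too long; fall through and close it
    finally:
        conn.close()


def reuse_after_idle(idle_timeout=IDLE_TIMEOUT):
    """Send one request, idle past the server timeout, then reuse the socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server = threading.Thread(
        target=serve_one_connection, args=(listener, idle_timeout)
    )
    server.start()

    client = socket.create_connection(listener.getsockname())
    client.settimeout(5)
    request = b"POST / HTTP/1.1\r\nHost: localhost\r\nContent-Length: 0\r\n\r\n"

    # First request on the fresh connection succeeds.
    client.sendall(request)
    first = client.recv(4096)

    # Idle long enough for the server side to close the connection.
    time.sleep(idle_timeout * 3)
    server.join()

    # Second request on the same (now half-closed) socket.
    try:
        client.sendall(request)
        second = client.recv(4096)
        outcome = "eof" if second == b"" else "unexpected data"
    except (ConnectionResetError, BrokenPipeError):
        outcome = "reset"
    finally:
        client.close()
        listener.close()
    return first, outcome


if __name__ == "__main__":
    first, outcome = reuse_after_idle()
    print(first.split(b"\r\n")[0], outcome)
```

Whether the second attempt surfaces as an end-of-stream read or as a reset error depends on the operating system and on timing, so the sketch reports either one rather than assuming a particular result.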