Re: cURL read and write buffering vs. OpenSSL [was: Windows users!...]
On Wed, 22 Aug 2018, Brad Spencer via curl-library wrote:

> So overall, this is pretty unfortunate. Perhaps someone more familiar with
> setting up OpenSSL BIO chains than I am might be able to tweak how cURL
> drives OpenSSL to use buffered reading (and writing?) here.

Yeah, that would be a good idea. I would expect that to boost transfer speeds a bit.

There are also additional benefits to doing something like that: perhaps we could finally stop getting SIGPIPE generated during SSL shutdowns!

--
 / daniel.haxx.se
---
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette: https://curl.haxx.se/mail/etiquette.html
cURL read and write buffering vs. OpenSSL [was: Windows users!...]
On 2018-08-14 7:02 AM, Daniel Stenberg via curl-library wrote:

> When I instead did the same upload over HTTPS to the same host but forced
> HTTP/1.1 the speeds were all remarkably similar. 500MB in 5 seconds should
> be just about maximum for 1000mbit...
>
>   Size    Seconds  Improvement
>   16KB    5.872    -
>   64KB    5.838    x 1
>   512KB   5.841    x 1

I'm a bit late to this large thread, so this might have already been discussed, but I happened to be recently taking a look at read and write buffering in cURL myself, so I thought I'd share what I found.

I first experimented with the read buffer size. The command-line tool (in 7.61.0) conveniently sets its default buffer size to 102400 bytes, so it's a nice place to start. Using strace, it's easy to show that HTTP recvfrom() calls vary in size depending on how fast content is appearing. This is good. For example:

    recvfrom(3, ""..., 102400, 0, NULL, NULL) = 61155
    recvfrom(3, ""..., 102400, 0, NULL, NULL) = 3085
    recvfrom(3, ""..., 102400, 0, NULL, NULL) = 61320
    recvfrom(3, ""..., 102400, 0, NULL, NULL) = 2920
    recvfrom(3, ""..., 102400, 0, NULL, NULL) = 62615
    recvfrom(3, ""..., 102400, 0, NULL, NULL) = 1625
    recvfrom(3, ""..., 102400, 0, NULL, NULL) = 62780
    recvfrom(3, ""..., 102400, 0, NULL, NULL) = 1460

But switch to HTTPS (with OpenSSL), and suddenly every read is approximately 16 KB. Well, actually, it's worse than that because there are "extra" tiny reads in between, too:

    read(3, ""..., 5) = 5
    read(3, ""..., 16084) = 16084
    read(3, ""..., 5) = 5
    read(3, ""..., 16084) = 16084
    read(3, ""..., 5) = 5
    read(3, ""..., 16084) = 16084
    read(3, ""..., 5) = 5
    read(3, ""..., 16084) = 16084

This suboptimal behaviour is basically due to OpenSSL. Even with non-blocking I/O, OpenSSL seems to read each TLS record individually, and in fact as two read() calls: one for the 5-byte record header and then another for the (maximum 16 KB) record body. No tweaking of cURL's read buffer size will change this pattern.

The same seems to apply to writes. Hacking the UPLOAD_BUFSIZE to be 128 KB, we see HTTP writes work as expected:

    sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
    sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
    sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
    sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
    sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
    sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
    sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
    sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072

But switch to HTTPS with OpenSSL and everything funnels through SSL_write(), and we get the same effect. At least the whole TLS record, including its header, is written all at once.

    write(3, ""..., 16413) = 16413
    write(3, ""..., 16413) = 16413
    write(3, ""..., 16413) = 16413
    write(3, ""..., 16413) = 16413
    write(3, ""..., 16413) = 16413
    write(3, ""..., 16413) = 16413
    write(3, ""..., 16413) = 16413
    write(3, ""..., 16413) = 16413

So overall, this is pretty unfortunate. Perhaps someone more familiar with setting up OpenSSL BIO chains than I am might be able to tweak how cURL drives OpenSSL to use buffered reading (and writing?) here.

--
Brad Spencer
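To put the traces above into rough numbers, here is a back-of-the-envelope sketch. It assumes a 500 MB transfer (the size from the quoted test), a best case for plain HTTP where every recvfrom() fills the whole 102400-byte buffer (the trace shows smaller reads in practice), and that every TLS record looks like the ones in the read trace. It counts wire bytes only and ignores handshake traffic. The helper name `syscall_count` is made up for this illustration.

```python
import math


def syscall_count(total_bytes: int, bytes_per_chunk: int, calls_per_chunk: int = 1) -> int:
    """Syscalls needed to move total_bytes when every chunk of
    bytes_per_chunk bytes costs calls_per_chunk syscalls."""
    return math.ceil(total_bytes / bytes_per_chunk) * calls_per_chunk


TOTAL = 500 * 1000 * 1000  # 500 MB, as in the quoted upload test

# Plain HTTP, best case: each recvfrom() returns a full 102400-byte buffer.
http_reads = syscall_count(TOTAL, 102400)

# HTTPS via OpenSSL as traced: a 5-byte header read plus a 16084-byte body
# read for every record, i.e. two syscalls per record.
https_reads = syscall_count(TOTAL, 16084, calls_per_chunk=2)

# Each 16413-byte write is one record: 5 header bytes plus 16408 body bytes.
# That is the 16384-byte TLS plaintext maximum plus 24 bytes of overhead;
# which cipher produces those 24 bytes is not stated in the thread.
write_overhead = 16413 - 5 - 16384

print(http_reads, https_reads, write_overhead)
```

Under these assumptions the HTTPS download needs more than twelve times as many read syscalls as the best-case HTTP one, which is the gap a buffering BIO would be meant to narrow.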
Re: Using A Different Socket For Requests
On Wednesday, 22 August 2018, Isaiah Banks via curl-library <curl-library@cool.haxx.se> wrote:

> What I'd like to do is create a custom socket for all curl requests to go
> through within a web application.
> I'm creating this socket within a Python application but would like an app
> written in PHP to send requests through it.

I'm not sure I understand your request correctly. Do you want to send requests from curl in PHP to some remote server, but also capture all the data going through in your Python application?

If you just want to capture traffic, check out CURLOPT_DEBUGFUNCTION, Fiddler or Wireshark.

If you do want all traffic to pass through your Python application, CURLOPT_PROXY is probably what you're after.
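If the Python application were to act as the proxy that CURLOPT_PROXY points at, its core would be a loop that copies bytes from one socket to another while recording them. The sketch below shows only that loop, under stated assumptions: the `relay` helper and its `log` list are hypothetical, and a real proxy would also have to accept the connection, parse the proxy request, open the upstream connection and run a second relay for the return direction.

```python
import socket
import threading


def relay(src: socket.socket, dst: socket.socket, log: list) -> None:
    """Copy bytes from src to dst until src signals end of stream,
    appending every chunk to log so the traffic can be inspected."""
    while True:
        chunk = src.recv(65536)
        if not chunk:
            break
        log.append(chunk)
        dst.sendall(chunk)
    # Pass the end-of-stream on to the other side.
    dst.shutdown(socket.SHUT_WR)


if __name__ == "__main__":
    # Stand-ins for "curl side" and "upstream side", built from socket pairs.
    client, proxy_in = socket.socketpair()
    proxy_out, upstream = socket.socketpair()
    captured = []
    worker = threading.Thread(target=relay, args=(proxy_in, proxy_out, captured))
    worker.start()
    client.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
    client.shutdown(socket.SHUT_WR)
    worker.join()
    print(b"".join(captured))
```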
Re: CURLOPT_ACCEPT_ENCODING and unknown / unsolicited encodings
On Wed, Aug 22, 2018 at 5:07 PM Patrick Monnerat via curl-library wrote:

Thanks for your prompt response.

[...]
> > https://github.com/curl/curl/commit/dbcced8e32b50c068ac297106f0502ee200a1ebd#diff-ff9fb98500e598660ec2dcd2d8193aac
> > if I'm not mistaken.
> This would have helped us much to have the curl version rather than the
> Ubuntu's. After grepping Ubuntu's repository, it appears that curl's
> version is 7.58.0.

7.58.0 is indeed correct.

> > In curl versions up to at least 7.56.0, setting
> > CURLOPT_ACCEPT_ENCODING to values other than NULL resulted in curl
> > decoding "gzip" and "deflate" and quietly passing any other Encoding,
> > such as "None", which is mistakenly used by one of our customers.
> "None" is recognized from 7.59.0, containing the commit
> https://github.com/curl/curl/commit/f886cbfe9c3055999d8174b2eedc826d0d9a54f1
> that implements it. Maybe Ubuntu should upgrade ;-)

That would indeed solve the immediate problem. I've opened a bug report for Ubuntu at
https://bugs.launchpad.net/ubuntu/+source/curl/+bug/1788435
although we do have another workaround running already.

thanks,

Rainer
Re: CURLOPT_ACCEPT_ENCODING and unknown / unsolicited encodings
On 08/22/2018 04:43 PM, Rainer Canavan via curl-library wrote:

> Apologies for dredging up an issue that has apparently been in published
> curl versions for at least about a year, but we've only just encountered it
> while upgrading a system from Ubuntu 17.10 to 18.04. The relevant commit is
> https://github.com/curl/curl/commit/dbcced8e32b50c068ac297106f0502ee200a1ebd#diff-ff9fb98500e598660ec2dcd2d8193aac
> if I'm not mistaken.

It would have helped us a lot to have the curl version rather than Ubuntu's. After grepping Ubuntu's repository, it appears that curl's version is 7.58.0.

> In curl versions up to at least 7.56.0, setting
> CURLOPT_ACCEPT_ENCODING to values other than NULL resulted in curl
> decoding "gzip" and "deflate" and quietly passing any other Encoding,
> such as "None", which is mistakenly used by one of our customers.

"None" is recognized from 7.59.0, which contains the commit
https://github.com/curl/curl/commit/f886cbfe9c3055999d8174b2eedc826d0d9a54f1
that implements it. Maybe Ubuntu should upgrade ;-)
CURLOPT_ACCEPT_ENCODING and unknown / unsolicited encodings
Apologies for dredging up an issue that has apparently been in published curl versions for at least about a year, but we've only just encountered it while upgrading a system from Ubuntu 17.10 to 18.04. The relevant commit is
https://github.com/curl/curl/commit/dbcced8e32b50c068ac297106f0502ee200a1ebd#diff-ff9fb98500e598660ec2dcd2d8193aac
if I'm not mistaken.

In curl versions up to at least 7.56.0, setting CURLOPT_ACCEPT_ENCODING to values other than NULL resulted in curl decoding "gzip" and "deflate" and quietly passing any other Encoding, such as "None", which is mistakenly used by one of our customers. Newer versions of curl return (61) "Unrecognized content encoding type...".

The new behavior has been documented in INTERNALS.md (and its predecessors) since 019c4088cf from April 2003 (with a minor error, see patch).
https://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
on the other hand does not specify how unknown encodings are handled. I would suggest copying the relevant sentence from INTERNALS.md in there.

As far as I can see, there are no options or combinations of options that can be set to restore the old behavior. That behavior is, at least for us, desirable because we can handle unknown encodings ourselves, in most cases by passing the unaltered response to the requestor, or, in the aforementioned case, by ignoring "None".

Am I overlooking something, or is there any chance to get the old behavior back in a future release, e.g. by requiring a specific value for CURLOPT_ACCEPT_ENCODING, a new option, maybe CURLOPT_IGNORE_UNKNOWN_CONTENT_ENCODING, or possibly a somewhat more sane method?

Rainer

diff --git a/docs/INTERNALS.md b/docs/INTERNALS.md
index ab04fec7e..944f26e06 100644
--- a/docs/INTERNALS.md
+++ b/docs/INTERNALS.md
@@ -678,7 +678,7 @@ Content Encoding
 understands how to process responses that use the "deflate", "gzip" and/or "br"
 content encodings, so the only values for [`CURLOPT_ACCEPT_ENCODING`][5]
 that will work (besides "identity," which does nothing) are "deflate",
- "gzip" and "br". If a response is encoded using the "compress" or methods,
+ "gzip" and "br". If a response is encoded using "compress" or any other unsupported methods,
 libcurl will return an error indicating that the response could
 not be decoded. If is NULL no Accept-Encoding header is generated.
 If is a zero-length string, then an Accept-Encoding header
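The "handle unknown encodings ourselves" policy described above can be illustrated on the application side. This is only a sketch of that policy, not libcurl's behaviour and not necessarily the workaround mentioned elsewhere in the thread. It assumes the application leaves CURLOPT_ACCEPT_ENCODING at NULL so that libcurl hands over the body undecoded, and that the application has the Content-Encoding header value at hand. The function name `decode_body` is hypothetical.

```python
import gzip
import zlib
from typing import Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Decode "gzip" and "deflate" bodies; return anything else unaltered."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        # HTTP "deflate" is the zlib container format. Servers that send raw
        # deflate streams instead are not handled in this sketch.
        return zlib.decompress(body)
    # "identity", the bogus "None", and any unknown token such as "br":
    # pass the bytes through so the requestor can deal with them.
    return body


if __name__ == "__main__":
    print(decode_body(gzip.compress(b"payload"), "gzip"))
    print(decode_body(b"payload", "None"))
```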
Re: Using A Different Socket For Requests
Isaiah Banks wrote:

> I'm sure this question has been asked before, but in doing some online
> research I have not found a good answer.
>
> What I'd like to do is create a custom socket for all curl requests to go
> through within a web application. I'm creating this socket within a Python
> application but would like an app written in PHP to send requests through it.
>
> I've tried creating it, binding it to a port, and then passing the port
> number to the cURL request using CURLOPT_LOCALPORT, but it didn't seem to
> work. Is this the only way to do it and still be able to monitor data
> packets? Or is there another recommended way?

Without knowing any details, or PHP in particular, I assume you'll need to call:

    setsockopt (sock, SOL_SOCKET, SO_REUSEADDR,..

before a bind() and before the socket can be shared like this. Or use SO_EXCLUSIVEADDRUSE, depending on your OS.

--
--gv
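Since the socket in question is created from Python, the ordering suggested above (set SO_REUSEADDR first, then bind) might look like the following. This is a minimal sketch: the helper name `make_reusable_listener` is invented for illustration, port 0 asks the OS to pick a free port, and whether the option is sufficient for the CURLOPT_LOCALPORT setup from the question depends on the operating system.

```python
import socket


def make_reusable_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind() to have any effect on the bind.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock


if __name__ == "__main__":
    listener = make_reusable_listener()
    print("listening on port", listener.getsockname()[1])
    listener.close()
```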
Using A Different Socket For Requests
Hello,

I'm sure this question has been asked before, but in doing some online research I have not found a good answer.

What I'd like to do is create a custom socket for all curl requests to go through within a web application. I'm creating this socket within a Python application but would like an app written in PHP to send requests through it.

I've tried creating it, binding it to a port, and then passing the port number to the cURL request using CURLOPT_LOCALPORT, but it didn't seem to work. Is this the only way to do it and still be able to monitor data packets? Or is there another recommended way?

Thanks in advance!