Re: cURL read and write buffering vs. OpenSSL [was: Windows users!...]

2018-08-22 Thread Daniel Stenberg via curl-library

On Wed, 22 Aug 2018, Brad Spencer via curl-library wrote:

So overall, this is pretty unfortunate.  Perhaps someone more familiar with 
setting up OpenSSL BIO chains than I am might be able to tweak how cURL drives 
OpenSSL to use buffered reading (and writing?) here.


Yeah, that would be a good idea. I would expect that to boost transfer speeds 
a bit.


There are also additional benefits with doing something like that, like 
perhaps we could finally stop getting SIGPIPE generated during SSL shutdowns!


--

 / daniel.haxx.se
---
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette:   https://curl.haxx.se/mail/etiquette.html

cURL read and write buffering vs. OpenSSL [was: Windows users!...]

2018-08-22 Thread Brad Spencer via curl-library

On 2018-08-14 7:02 AM, Daniel Stenberg via curl-library wrote:
When I instead did the same upload over HTTPS to the same host but 
forced HTTP/1.1 the speeds were all remarkably similar. 500MB in 5 
seconds should be just about maximum for 1000mbit...


  Size     Seconds  Improvement

  16KB     5.872    -
  64KB     5.838    x 1
  512KB    5.841    x 1


I'm a bit late to this large thread, so this might have already been 
discussed, but I happened to be recently taking a look at read and write 
buffering in cURL myself, so I thought I'd share what I found.


I first experimented with the read buffer size.  The command-line tool 
(in 7.61.0) conveniently sets its default buffer size to 102400 bytes, 
so it's a nice place to start.  Using strace, it's easy to show that 
HTTP recvfrom() calls vary in size depending on how fast content is 
appearing.  This is good.  For example:


recvfrom(3, ""..., 102400, 0, NULL, NULL) = 61155
recvfrom(3, ""..., 102400, 0, NULL, NULL) = 3085
recvfrom(3, ""..., 102400, 0, NULL, NULL) = 61320
recvfrom(3, ""..., 102400, 0, NULL, NULL) = 2920
recvfrom(3, ""..., 102400, 0, NULL, NULL) = 62615
recvfrom(3, ""..., 102400, 0, NULL, NULL) = 1625
recvfrom(3, ""..., 102400, 0, NULL, NULL) = 62780
recvfrom(3, ""..., 102400, 0, NULL, NULL) = 1460

But, switch to HTTPS (with OpenSSL), and suddenly every read is 
approximately 16 KB.  Well, actually, it's worse than that because there 
are "extra" tiny reads in between, too:


read(3, ""..., 5)   = 5
read(3, ""..., 16084)   = 16084
read(3, ""..., 5)   = 5
read(3, ""..., 16084)   = 16084
read(3, ""..., 5)   = 5
read(3, ""..., 16084)   = 16084
read(3, ""..., 5)   = 5
read(3, ""..., 16084)   = 16084

This suboptimal behaviour is basically due to OpenSSL.  Even with 
non-blocking I/O, OpenSSL seems to read each TLS record individually, 
and in fact as two read() calls: one for the 5-byte record header and 
then another for the (maximum 16 KB) record body.  No tweaking of cURL's 
read buffer size will change this pattern.
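
The 5-byte reads in the trace match the size of a TLS record header (1 byte 
content type, 2 bytes protocol version, 2 bytes body length).  A reader that 
does not buffer ahead has to fetch the header first to learn how long the 
body is, which would explain the two-call pattern.  Below is a minimal 
sketch of that header layout; the function name and the example bytes are 
made up for illustration and this is not OpenSSL's code.

```python
import struct

TLS_RECORD_HEADER_LEN = 5  # content type (1) + protocol version (2) + length (2)


def parse_record_header(header: bytes):
    """Split a 5-byte TLS record header into (type, version, body_length)."""
    if len(header) != TLS_RECORD_HEADER_LEN:
        raise ValueError("a TLS record header is exactly 5 bytes")
    # Network byte order: three single bytes followed by a 16-bit length.
    content_type, major, minor, body_length = struct.unpack("!BBBH", header)
    return content_type, (major, minor), body_length


# Illustrative header, not captured data: application data (type 23),
# version bytes 3.3, announcing a 16084-byte body like the reads above.
example = bytes([23, 3, 3]) + (16084).to_bytes(2, "big")
content_type, version, body_length = parse_record_header(example)
print(content_type, version, body_length)
```

Only after those 5 bytes are parsed is the size of the second read() known, 
so a larger cURL buffer alone cannot merge the two calls.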


The same seems to apply to writes.  Hacking the UPLOAD_BUFSIZE to be 128 
KB, we see HTTP writes work as expected:


sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072
sendto(3, ""..., 131072, MSG_NOSIGNAL, NULL, 0) = 131072

But switch to HTTPS with OpenSSL and everything funnels through 
SSL_write() and we get the same effect.  At least the whole TLS record, 
including its header, is written all at once.


write(3, ""..., 16413)  = 16413
write(3, ""..., 16413)  = 16413
write(3, ""..., 16413)  = 16413
write(3, ""..., 16413)  = 16413
write(3, ""..., 16413)  = 16413
write(3, ""..., 16413)  = 16413
write(3, ""..., 16413)  = 16413
write(3, ""..., 16413)  = 16413
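
To put rough numbers on the difference, here is a small back-of-the-envelope 
sketch.  It assumes the "500MB" upload quoted above means 500 MiB, that every 
call moves a full chunk, and that each 16413-byte write carries the maximum 
16384 bytes of TLS plaintext; none of that is confirmed by the traces.

```python
MAX_TLS_PLAINTEXT = 16384  # 2**14 bytes, the TLS limit on plaintext per record
HTTP_CHUNK = 131072        # the hacked UPLOAD_BUFSIZE from the HTTP trace


def calls_needed(total_bytes: int, bytes_per_call: int) -> int:
    """Number of calls needed to move total_bytes, rounding up."""
    return -(-total_bytes // bytes_per_call)  # ceiling division


upload = 500 * 1024 * 1024  # assumption: "500MB" read as 500 MiB

http_calls = calls_needed(upload, HTTP_CHUNK)
tls_calls = calls_needed(upload, MAX_TLS_PLAINTEXT)
overhead = 16413 - MAX_TLS_PLAINTEXT  # bytes per record beyond the plaintext

print("plain HTTP sendto() calls:", http_calls)
print("TLS write() calls:", tls_calls)
print("per-record overhead in bytes:", overhead)
```

Under those assumptions TLS needs eight times as many system calls for the 
same payload.  The per-record overhead would be the 5-byte header plus 24 
more bytes, which would fit an AES-GCM suite in TLS 1.2 (8-byte explicit 
nonce plus 16-byte tag), but the cipher in use is not stated here, so that 
part is a guess.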

So overall, this is pretty unfortunate.  Perhaps someone more familiar with 
setting up OpenSSL BIO chains than I am might be able to tweak how cURL 
drives OpenSSL to use buffered reading (and writing?) here.



--
Brad Spencer

Re: Using A Different Socket For Requests

2018-08-22 Thread Daniel Jeliński via curl-library
On Wednesday, 22 August 2018, Isaiah Banks via curl-library
<curl-library@cool.haxx.se> wrote:
> What I'd like to do is create a custom socket for all curl requests to go
> through within a web application.
> I'm creating this socket within a Python application but would like an app
> written in PHP to send requests through it.

I'm not sure if I understand your request correctly. Do you want to send
requests from curl in PHP to some remote server, but also capture all data
going through in your python application?

If you just want to capture traffic, check out CURLOPT_DEBUGFUNCTION,
Fiddler or Wireshark. If indeed you want to have all traffic in your python
application, CURLOPT_PROXY is probably what you're after.
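
For illustration, here is a minimal sketch of what the Python side of the
CURLOPT_PROXY approach could look like, using only the standard library. The
class and function names are made up, it only handles plain-HTTP GET
requests, and it is an assumption about the setup rather than a tested
recipe for any particular PHP application.

```python
import http.server
import urllib.error
import urllib.request


class LoggingProxy(http.server.BaseHTTPRequestHandler):
    """Forward plain-HTTP GET requests and remember what passed through."""

    seen = []  # (method, url, status, body_length), shared by all requests

    def do_GET(self):
        # A client talking to a proxy puts the absolute URL in the request line.
        # The empty ProxyHandler keeps the upstream fetch from using a proxy.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        try:
            with opener.open(self.path, timeout=10) as upstream:
                status, body = upstream.status, upstream.read()
        except urllib.error.HTTPError as err:
            status, body = err.code, err.read()  # relay upstream error replies
        except (OSError, ValueError):
            status, body = 502, b"upstream request failed\n"
        self.seen.append((self.command, self.path, status, len(body)))
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep stderr quiet; the record lives in LoggingProxy.seen


def make_proxy(host="127.0.0.1", port=0):
    """Create (but do not start) the proxy; port 0 lets the OS pick a port."""
    return http.server.ThreadingHTTPServer((host, port), LoggingProxy)
```

Calling serve_forever() on the returned server starts it, and the PHP side
would then point CURLOPT_PROXY at the host and port it listens on. HTTPS
requests go through the proxy with CONNECT, which this sketch does not
implement.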

Re: CURLOPT_ACCEPT_ENCODING and unknown / unsolicited encodings

2018-08-22 Thread Rainer Canavan via curl-library
On Wed, Aug 22, 2018 at 5:07 PM Patrick Monnerat via curl-library
 wrote:

Thanks for your prompt response.

[...]
> > https://github.com/curl/curl/commit/dbcced8e32b50c068ac297106f0502ee200a1ebd#diff-ff9fb98500e598660ec2dcd2d8193aac
> > if I'm not mistaken.
> It would have helped us a lot to have the curl version rather than
> Ubuntu's. After grepping Ubuntu's repository, it appears that curl's
> version is 7.58.0.

7.58.0 is indeed correct.

> > In curl versions up to at least 7.56.0, setting
> > CURLOPT_ACCEPT_ENCODING to values other than NULL resulted in curl
> > decoding "gzip" and "deflate" and quietly passing any other Encoding,
> > such as "None", which is mistakenly used by one of our customers.
> "None" is recognized from 7.59.0, containing the commit
> https://github.com/curl/curl/commit/f886cbfe9c3055999d8174b2eedc826d0d9a54f1
> that implements it. Maybe Ubuntu should upgrade ;-)

That would indeed solve the immediate problem. I've opened a bug
report for Ubuntu at https://bugs.launchpad.net/ubuntu/+source/curl/+bug/1788435
although we do have another workaround running already.

thanks,

Rainer

Re: CURLOPT_ACCEPT_ENCODING and unknown / unsolicited encodings

2018-08-22 Thread Patrick Monnerat via curl-library


On 08/22/2018 04:43 PM, Rainer Canavan via curl-library wrote:

> Apologies for dredging up an issue that has apparently been in
> published curl versions for at least about a year, but we've only
> just encountered it while upgrading a system from Ubuntu 17.10 to
> 18.04. The relevant commit is
> https://github.com/curl/curl/commit/dbcced8e32b50c068ac297106f0502ee200a1ebd#diff-ff9fb98500e598660ec2dcd2d8193aac
> if I'm not mistaken.

It would have helped us a lot to have the curl version rather than 
Ubuntu's. After grepping Ubuntu's repository, it appears that curl's 
version is 7.58.0.

> In curl versions up to at least 7.56.0, setting
> CURLOPT_ACCEPT_ENCODING to values other than NULL resulted in curl
> decoding "gzip" and "deflate" and quietly passing any other Encoding,
> such as "None", which is mistakenly used by one of our customers.

"None" is recognized from 7.59.0, containing the commit 
https://github.com/curl/curl/commit/f886cbfe9c3055999d8174b2eedc826d0d9a54f1 
that implements it. Maybe Ubuntu should upgrade ;-)



CURLOPT_ACCEPT_ENCODING and unknown / unsolicited encodings

2018-08-22 Thread Rainer Canavan via curl-library
Apologies for dredging up an issue that has apparently been in
published curl versions for at least about a year, but we've only
just encountered it while upgrading a system from Ubuntu 17.10 to
18.04. The relevant commit is
https://github.com/curl/curl/commit/dbcced8e32b50c068ac297106f0502ee200a1ebd#diff-ff9fb98500e598660ec2dcd2d8193aac
if I'm not mistaken.

In curl versions up to at least 7.56.0, setting
CURLOPT_ACCEPT_ENCODING to values other than NULL resulted in curl
decoding "gzip" and "deflate" and quietly passing any other Encoding,
such as "None", which is mistakenly used by one of our customers.
Newer versions of curl return (61) "Unrecognized content encoding
type...". The new behavior is documented in INTERNALS.md (and its
predecessors) since 019c4088cf from April 2003 (with a minor error,
see patch). https://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
on the other hand does not specify how unknown encodings are handled -
I would suggest copying the relevant sentence from INTERNALS.md in
there.
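
To make the two behaviors concrete, here is a small Python model of the
difference as I understand it. It is only a sketch: the function, the
exception class and the "strict" flag are invented for illustration and do
not correspond to anything in libcurl.

```python
import gzip
import zlib


class UnrecognizedEncoding(Exception):
    """Stand-in for libcurl's error 61 in this sketch."""


def decode_body(body: bytes, content_encoding: str, strict: bool) -> bytes:
    """Decode a response body according to its Content-Encoding value."""
    encoding = content_encoding.strip().lower()
    if encoding in ("", "identity"):
        return body
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        return zlib.decompress(body)  # expects a zlib-wrapped deflate stream
    if strict:
        # Newer behavior: refuse anything that cannot be decoded.
        raise UnrecognizedEncoding(content_encoding)
    return body  # older behavior: hand the bytes over untouched


payload = b"hello"
print(decode_body(gzip.compress(payload), "gzip", strict=True))
print(decode_body(payload, "None", strict=False))
```

With strict=False the bogus "None" value is passed through so the caller can
deal with it, which is the behavior we relied on; with strict=True the same
input raises instead.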

As far as I can see, there are no options or combinations of options
that can be set to restore the old behavior, which, at least for us,
is desirable in that we can handle unknown encodings ourselves, in
most cases by passing the unaltered response to the requestor, or in
the aforementioned case, ignoring "None". Am I overlooking something,
or is there any chance to get the old behavior back in a future
release, e.g. by requiring a specific value for
CURLOPT_ACCEPT_ENCODING, a new option, maybe
CURLOPT_IGNORE_UNKNOWN_CONTENT_ENCODING, or possibly a somewhat more
sane method?

Rainer
diff --git a/docs/INTERNALS.md b/docs/INTERNALS.md
index ab04fec7e..944f26e06 100644
--- a/docs/INTERNALS.md
+++ b/docs/INTERNALS.md
@@ -678,7 +678,7 @@ Content Encoding
  understands how to process responses that use the "deflate", "gzip" and/or
  "br" content encodings, so the only values for [`CURLOPT_ACCEPT_ENCODING`][5]
  that will work (besides "identity," which does nothing) are "deflate",
- "gzip" and "br". If a response is encoded using the "compress" or methods,
+ "gzip" and "br". If a response is encoded using "compress" or any other unsupported methods,
  libcurl will return an error indicating that the response could
  not be decoded.  If <string> is NULL no Accept-Encoding header is generated.
  If <string> is a zero-length string, then an Accept-Encoding header

Re: Using A Different Socket For Requests

2018-08-22 Thread Gisle Vanem via curl-library

Isaiah Banks wrote:

I'm sure this question has been asked before, but in doing some online research I have not found a good answer. What I'd 
like to do is create a custom socket for all curl requests to go through within a web application.


I'm creating this socket within a Python application but would like an app written in PHP to send requests through it. I've 
tried creating it, binding it to a port, and then passing the port number to the cURL request using CURLOPT_LOCALPORT, 
but it didn't seem to work. Is this the only way to do it and still be able to monitor data packets? Or is there another 
recommended way?


W/o knowing any details or especially PHP, I assume you'll
need to call:
  setsockopt (sock, SOL_SOCKET, SO_REUSEADDR,..

before a bind() and before the socket can be shared like this.
Or use SO_EXCLUSIVEADDRUSE depending on your OS.
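
On the Python side, the ordering would look roughly like the sketch below.
The helper name and the defaults are made up, and this only shows setting
the option before bind(); whether sharing the port this way gives you what
you want is a separate question.

```python
import socket


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The option has to be set before bind() to affect address reuse.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))  # port 0 asks the OS for any free port
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock


listener = make_listener()
print("listening on port", listener.getsockname()[1])
listener.close()
```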

--
--gv

Using A Different Socket For Requests

2018-08-22 Thread Isaiah Banks via curl-library
Hello,

I'm sure this question has been asked before, but in doing some online
research I have not found a good answer. What I'd like to do is create a
custom socket for all curl requests to go through within a web application.

I'm creating this socket within a Python application but would like an app
written in PHP to send requests through it. I've tried creating it, binding
it to a port, and then passing the port number to the cURL request using
CURLOPT_LOCALPORT, but it didn't seem to work. Is this the only way to do
it and still be able to monitor data packets? Or is there another
recommended way?

Thanks in advance!