Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-15 Thread Daniel Stenberg via curl-library

On Wed, 15 Aug 2018, myLC--- via curl-library wrote:

> Just upfront: you make it sound like my request for being able to get all
> associated IPs is a bizarre idea. I feel the need to state that many
> libraries have this implemented naturally for a good reason (a: it doesn't
> cost anything + b: in some scenarios, you need it).


I don't think it is a bizarre idea, but I'm very much in favour of features
being well motivated, with the reasoning and use cases well understood. That
helps avoid misunderstandings and implementations that end up not fulfilling
those use cases.


Also, libcurl has provided internet transfer powers to applications for nearly 
two decades by now and all those existing users have managed without this 
feature so I think a little scepticism is natural.


So thanks for enduring the questions!

--

 / daniel.haxx.se
---
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette:   https://curl.haxx.se/mail/etiquette.html

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-15 Thread myLC--- via curl-library


Just upfront: you make it sound like my request for being
able to get all associated IPs is a bizarre idea. I feel the
need to state that many libraries have this implemented
naturally for a good reason (a: it doesn't cost anything +
b: in some scenarios, you need it).

The upcoming C++ networking will have this:
"The class template basic_resolver_results satisfies the
 requirements of a sequence container ... The class template
 basic_resolver_results supports forward iterators."
Translation: You get the list of IPs.

Qt has this:
"QList<QHostAddress> QHostInfo::addresses() const
 Returns the list of IP addresses associated with hostName()."

Many others have this as well. The C++ standard committee
rarely implements anything that is not going to be used.


On Tue, 14 Aug 2018, Gisle Vanem wrote:

> "should favour" how? Based on what; that IPv6 is
> better/speedier than IPv4, or some addresses based on Geo-
> location is best? libcurl knows zero about this. It would
> be cool if it did though.

This should be handled by the layers below libcurl. The
function(s) used by libcurl can or should return the list
with prioritization in mind. A short discussion about this
can be found here:

https://stackoverflow.com/questions/11241339/will-getaddrinfo-return-ipv6-addresses-first



On Tue, 14 Aug 2018, Richard Gray wrote:

> OK, filtering - got it.

That's not a small area, btw. ;-)


> You still haven't indicated what kind of access(es) you
> are trying to perform with the potentially multiple
> addresses. Are you trying to just find the first one that
> works? Are you trying to actually access more than one of
> them? The latter case might be something like testing the
> various hosts behind a load leveler.

Your application might simply choose to exclude some for
particular reasons, while preferring others. Or it might
simply want to document the mappings. There are many
applications for this...


> No, you would do the resolves then tell libcurl to operate
> on any returned address(es) you are interested in.

Yes, this would essentially result in having the same code
twice and having to maintain that code. Most/many other
libraries didn't go for that approach. They simply hand you
the list, which they have anyhow.

---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-14 Thread Richard Gray via curl-library

myLC--- via curl-library wrote:

> On Mon, 13 Aug 2018, Richard Gray wrote:
>
>> I'm confused about what it is you are trying to do with
>> the list of addresses?
> ...
>
> If you have a list of IPs/hostnames, this is necessary to
> identify duplicate entries (public VPNs or proxies, for
> instance).


OK, filtering - got it.


>> It's not clear to me why you are trying to get libcurl to
>> return that address list.
> ...


You still haven't indicated what kind of access(es) you are trying to perform
with the potentially multiple addresses.  Are you trying to just find the first
one that works?  Are you trying to actually access more than one of them?  The
latter case might be something like testing the various hosts behind a load
leveler.




>> If you are on a modern system, you already have a way to
>> do this: getaddrinfo() or equivalent.
>
> That would imply doing it twice – libcurl would do it once
> and then you'd do the same afterwards.


No, you would do the resolves then tell libcurl to operate on any returned 
address(es) you are interested in.


If the host(s) don't have a problem with an access using a literal IP for the 
URL, just format URLs with literal IPs instead of host names.


If the host(s) need the actual host name you resolved against for TLS or other
purposes, will CURLOPT_CONNECT_TO not do what you want by telling libcurl not
to resolve the host name but to use the IP you supply instead?


With either of these options, the host is not redundantly resolved and you are 
in complete control of what IPs are ignored/accessed, if they are accessed 
sequentially or in parallel, etc.   I think this might be what you are after. 
I don't see how an extension to get the full address list would help because 
you'd still have to do something like the above for the rest of any addresses 
you were interested in anyway.


Cheers!
Rich


---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-14 Thread Gisle Vanem via curl-library

myLC--- wrote:


> Prioritization (which IPs libcurl should favor) might become
> an issue then.


"should favour" how? Based on what; that IPv6 is better/speedier
than IPv4, or some addresses based on Geo-location is best?
libcurl knows zero about this. It would be cool if it did though.

In fact IPv6 can be a lot slower than IPv4. My case right now
(with IPv6 over a '6to4' tunnel) is that a:
  curl -6 server-in-Oslo-Norway
goes via California! (3 times slower than with 'curl -4').

So much for this hyped-up IPv6 protocol.

--
--gv
---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-14 Thread myLC--- via curl-library

On Mon, 13 Aug 2018, Richard Gray wrote:

> I'm confused about what it is you are trying to do with
> the list of addresses?
...

If you have a list of IPs/hostnames, this is necessary to
identify duplicate entries (public VPNs or proxies, for
instance).
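
A minimal sketch of that duplicate check, assuming the application does the lookups itself with getaddrinfo() and that "duplicate" means two names sharing at least one IP address. The function names are made up for this illustration; IPv4-mapped IPv6 addresses are not treated as equal to their IPv4 form here.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Return 1 if two resolved entries carry the same IP address. */
static int same_ip(const struct addrinfo *a, const struct addrinfo *b)
{
  if(a->ai_family != b->ai_family)
    return 0;
  if(a->ai_family == AF_INET) {
    const struct sockaddr_in *x = (const struct sockaddr_in *)a->ai_addr;
    const struct sockaddr_in *y = (const struct sockaddr_in *)b->ai_addr;
    return x->sin_addr.s_addr == y->sin_addr.s_addr;
  }
  if(a->ai_family == AF_INET6) {
    const struct sockaddr_in6 *x = (const struct sockaddr_in6 *)a->ai_addr;
    const struct sockaddr_in6 *y = (const struct sockaddr_in6 *)b->ai_addr;
    return memcmp(&x->sin6_addr, &y->sin6_addr, sizeof(x->sin6_addr)) == 0;
  }
  return 0;
}

/* Resolve a host name or literal; returns NULL if the lookup fails. */
static struct addrinfo *resolve(const char *host)
{
  struct addrinfo hints, *res = NULL;

  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;      /* IPv4 and IPv6 */
  hints.ai_socktype = SOCK_STREAM;  /* one entry per address */
  if(getaddrinfo(host, NULL, &hints, &res) != 0)
    return NULL;
  return res;
}

/* Return 1 if the two hosts share at least one address, 0 if they do not,
   -1 if either lookup failed. */
static int hosts_overlap(const char *h1, const char *h2)
{
  struct addrinfo *r1 = resolve(h1);
  struct addrinfo *r2 = resolve(h2);
  const struct addrinfo *a, *b;
  int result = (r1 && r2) ? 0 : -1;

  for(a = r1; a && result == 0; a = a->ai_next) {
    for(b = r2; b; b = b->ai_next) {
      if(same_ip(a, b)) {
        result = 1;
        break;
      }
    }
  }
  if(r1)
    freeaddrinfo(r1);
  if(r2)
    freeaddrinfo(r2);
  return result;
}

int main(int argc, char **argv)
{
  const char *h1 = argc > 2 ? argv[1] : "127.0.0.1";
  const char *h2 = argc > 2 ? argv[2] : "127.0.0.2";

  printf("%s / %s: %d\n", h1, h2, hosts_overlap(h1, h2));
  return 0;
}
```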


> It's not clear to me why you are trying to get libcurl to
> return that address list.
...
> If you are on a modern system, you already have a way to
> do this: getaddrinfo() or equivalent.

That would imply doing it twice – libcurl would do it once
and then you'd do the same afterwards.


> I guess I'm wondering if it makes more sense for your
> application to get the list of addresses itself and then
> tell libcurl what to do with them.

Yes, but libcurl already does the same. Of course, you can
do it yourself and hand the list of IPs to libcurl.
Prioritization (which IPs libcurl should favor) might become
an issue then. This way, you'd end up having to mimic just
about everything libcurl normally does. Furthermore, you'd
have to devise a way to hand over that information in a
proper form. This would be essentially the same in reverse.
It would result in having the same code twice in your
(static) binaries, though.


---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-13 Thread Richard Gray via curl-library

https://curl.haxx.se//mail/lib-2018-08/0137.html
myLC--- via curl-library wrote:

> Hello, :-)
>
>
> I'm new to (lib)curl. I decided that this was the most
> complete network implementation, after ditching Qt for
> continuous multithreading issues.
>
> I would like to know how we can retrieve all the IP
> addresses which are mapped to a host.
>
> Assuming we have the URL
> https://example.buzz/bingo_results/
>
> and assuming further that there are 5 addresses mapped to
> this hostname:
> 10.0.0.11, 10.0.0.12, 10.0.0.13,
> fd0e:34f4:760f:5bd6:0123:4567:89ab:cdef,
> fd0e:34f4:760f:5bd6::::
>
> How would I get them from libcurl?
>
> The "inbound" network implementation (TR) of C++ (Boost
> ASIO) returns an iterator upon having resolved a hostname.
> If I'm correct, the implementation chooses the IP randomly
> when connecting. Likewise, it tries the entire list of IPs
> before giving up. I haven't checked out libcurl's source-
> code, but I'm guessing that you will do something similar.
> Virtually all the functions for resolving hostnames return
> all the addresses (getaddrinfo, for instance).
> Therefore, at some point, libcurl must have this
> information. So, how do I get that list of all addresses?
> I'm inquiring, because I would hate to employ a second
> engine/dispatcher for this.


I'm confused about what it is you are trying to do with the list of addresses?
- connect to each one and perform an operation?
- try to connect (simultaneously?) to several and find
  the first one that works?

Are you asking libcurl to do a normal operation on a given host URL with host 
name and oh, BTW, return whatever address list libcurl resolved??  This seems 
like a highly specialized feature to add to libcurl.


It's not clear to me why you are trying to get libcurl to return that address 
list.  If you are on a modern system, you already have a way to do this: 
getaddrinfo() or equivalent.
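
For reference, a minimal sketch of the getaddrinfo() route on a POSIX system: it resolves a name and prints every address in the order the resolver returned them. The helper name list_addresses() and the buffer sizes are choices made for this illustration, not anything libcurl provides.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

#define MAX_ADDRS    16
#define ADDR_STR_LEN 64

/* Resolve `host` and store up to `max` numeric address strings in `out`,
   in the order getaddrinfo() returned them. Returns the number stored,
   or -1 if the lookup failed. */
static int list_addresses(const char *host,
                          char out[][ADDR_STR_LEN], int max)
{
  struct addrinfo hints, *res, *ai;
  int n = 0;

  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;      /* both IPv4 and IPv6 */
  hints.ai_socktype = SOCK_STREAM;  /* avoid one entry per socket type */

  if(getaddrinfo(host, NULL, &hints, &res) != 0)
    return -1;

  for(ai = res; ai && n < max; ai = ai->ai_next) {
    /* NI_NUMERICHOST: format the address as text, no reverse lookup */
    if(getnameinfo(ai->ai_addr, ai->ai_addrlen, out[n], ADDR_STR_LEN,
                   NULL, 0, NI_NUMERICHOST) == 0)
      n++;
  }
  freeaddrinfo(res);
  return n;
}

int main(int argc, char **argv)
{
  char addrs[MAX_ADDRS][ADDR_STR_LEN];
  const char *host = argc > 1 ? argv[1] : "127.0.0.1";
  int i;
  int n = list_addresses(host, addrs, MAX_ADDRS);

  if(n < 0) {
    fprintf(stderr, "could not resolve %s\n", host);
    return 1;
  }
  for(i = 0; i < n; i++)
    printf("%s\n", addrs[i]);
  return 0;
}
```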


I guess I'm wondering if it makes more sense for your application to get the 
list of addresses itself and then tell libcurl what to do with them. Is it 
good enough to create URLs with each IP literal?  Perhaps what you are looking 
for is a way to perform the same operation on a given URL but with different 
IP addresses?  (All hosts see the same textual host name?) Possibly 
simultaneously via multi?



Or have I totally missed something?  To have discussions go right into API 
details suggests that this is of more value than I thought, even though I 
don't really understand why.


Cheers!
Rich
---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-13 Thread Daniel Stenberg via curl-library

On Mon, 13 Aug 2018, myLC--- via curl-library wrote:

> With a callback you'd make sure that the user takes care of it when the data
> becomes available in that format - and only then.


Yes, but since we need to copy the entire data to the export struct, keeping 
that memory around a little longer is not much harder. But of course, if done 
for a lot of connections with a lot of addresses, doing the callback approach 
saves notable amounts of memory. Plus the fact that if there's no callback, 
there's no need to do any allocs/copies, which of course is a big plus.


A callback approach might also get away with less copying and more "pointing".

> The addrinfo structure is well documented (spares you the hassle of writing
> it up yourself;-).


1. we support platforms without it

2. we can't assume that users will find the struct somewhere else (and that it 
will be sufficiently documented where found) so we still need to document it 
if we use it in our API


> If you name the constant/callback after this structure, you can have a new
> one if the internal structure changes.


That could easily get annoying and require API changes too often.

Separating external and internal structs is just sensible and more likely to 
keep us sane in the future.
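
To make the trade-off concrete, here is a small sketch of the callback idea with a dedicated export struct. Everything in it is hypothetical: struct exported_addr, resolved_cb and resolve_and_notify() are invented for this illustration and are not libcurl API, and plain getaddrinfo() stands in for whatever resolver sits underneath. It shows the two points made above: the exported form is separate from the resolver's own struct, and the conversion is skipped entirely when no callback is set.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

#define MAX_EXPORT 16

/* Hypothetical export struct, deliberately independent of addrinfo. */
struct exported_addr {
  int family;    /* AF_INET or AF_INET6 */
  char ip[64];   /* numeric address as text */
};

/* Hypothetical callback type, invoked once per completed lookup. */
typedef void (*resolved_cb)(const struct exported_addr *addrs,
                            size_t count, void *userp);

/* Stand-in for "right after the host name has been resolved".
   The export copy is only built when a callback is installed. */
static int resolve_and_notify(const char *host, resolved_cb cb, void *userp)
{
  struct addrinfo hints, *res, *ai;

  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  if(getaddrinfo(host, NULL, &hints, &res) != 0)
    return -1;

  if(cb) {
    struct exported_addr out[MAX_EXPORT];
    size_t n = 0;

    for(ai = res; ai && n < MAX_EXPORT; ai = ai->ai_next) {
      if(getnameinfo(ai->ai_addr, ai->ai_addrlen, out[n].ip,
                     sizeof(out[n].ip), NULL, 0, NI_NUMERICHOST) == 0) {
        out[n].family = ai->ai_family;
        n++;
      }
    }
    cb(out, n, userp);  /* the array is only valid during this call */
  }
  freeaddrinfo(res);
  return 0;
}

/* Example callback: print each address and add the count to *userp. */
static void count_cb(const struct exported_addr *addrs, size_t count,
                     void *userp)
{
  size_t i;

  for(i = 0; i < count; i++)
    printf("%s %s\n", addrs[i].family == AF_INET6 ? "IPv6" : "IPv4",
           addrs[i].ip);
  *(size_t *)userp += count;
}

int main(void)
{
  size_t total = 0;

  if(resolve_and_notify("127.0.0.1", count_cb, &total) != 0)
    return 1;
  printf("%lu address(es)\n", (unsigned long)total);
  return 0;
}
```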


--

 / daniel.haxx.se
---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-13 Thread myLC--- via curl-library

On 08/13/2018 12:24 PM, Daniel Stenberg wrote:
> ...
> Sure, I'm not totally against a callback. But we'd still have to convert
> the representation to something that we think we can stick to for the
> foreseeable future even if the internals would change...

I do not really care what the structure looks like. I simply predict that the
entire caboose will be converted twice in most cases.
With a callback you'd make sure that the user takes care of it when the data
becomes available in that format - and only then.
The addrinfo structure is well documented (spares you the hassle of writing it
up yourself;-). If you name the constant/callback after this structure, you can
have a new one if the internal structure changes. Then the old addrinfo export
would do the copying and be marked as deprecated (you'd then delete the created
structures after the callback returns).

I simply look at it from a performance angle. Who needs all addresses? This is
the exception. It's likely that programs dealing with lots of entries will use
this. They will probably use their own format as they will store other
information, and they might not need all the data/addresses from your
structures.

What you suggest surely sounds 'cleaner'. Nonetheless, you'll get the 'copy and
convert the whole list twice for every lookup' almost guaranteed. That is, your
conversion will be discarded immediately afterwards, just to become something
new entirely.

I guess it depends on personal preference which of those solutions
hurts more. :-)


---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-13 Thread Daniel Stenberg via curl-library

On Mon, 13 Aug 2018, myLC--- via curl-library wrote:

>> I think most existing users would presume such data to become available via
>> curl_easy_getinfo(), like the primary IP and friends already are.
>
> I'm obviously not that acquainted with curl. I took a glance at the source.
> Unless I'm mistaken, you are using a renamed addrinfo struct on the inside.


Yes, although not just renamed, it's an actual private version of the entire 
struct.


> Could it lead to problems with multiple threads, if you simply passed a
> pointer to that (chain of) struct(s) via curl_easy_getinfo?


We never export internal data like that. Export data needs to get their own 
struct if a struct is to be used so that we don't mix internal representations 
with what we promise in external APIs/ABIs.


But more so: there's no guarantee that we have the name resolved data left 
around after a transfer is complete as it is only saved in the DNS cache for a 
certain (customizable) time. We would probably need to convert the internal 
data to the exportable representation at the lookup time.


> By using the callback after having resolved the hostname, you'd dispose of
> this burden.


Sure, I'm not totally against a callback. But we'd still have to convert the
representation to something that we think we can stick to for the foreseeable
future even if the internals would change...


--

 / daniel.haxx.se
---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-13 Thread Gisle Vanem via curl-library

myLC---wrote:


> at the source. Unless I'm mistaken, you are using a renamed
> addrinfo struct on the inside.
> Could it lead to problems with multiple threads, if you
> simply passed a pointer to that (chain of) struct(s) via
> curl_easy_getinfo?


It's copied to an internal structure inside 'singleipconnect()'
AFAICS.

--
--gv
---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-13 Thread myLC--- via curl-library

On 08/12/2018 06:55 PM, Daniel Stenberg wrote:
> ...
> We wouldn't expose internals, either way. If we would
> provide the data in an array or struct somehow, that would
> be made specifically for exporting purposes. There would be
> no difference between providing the data in a callback or
> post-transfer in a curl_easy_getinfo() call.
>
> I think most existing users would presume such data to
> become available via curl_easy_getinfo(), like the primary
> IP and friends already are.


I'm obviously not that acquainted with curl. I took a glance
at the source. Unless I'm mistaken, you are using a renamed
addrinfo struct on the inside.
Could it lead to problems with multiple threads, if you
simply passed a pointer to that (chain of) struct(s) via
curl_easy_getinfo? If so, you'd have to copy it (or convert
it to something else). Users then probably copy the
addresses to an entirely different format anyhow.
By using the callback after having resolved the hostname,
you'd dispose of this burden. That was my thinking. Given
that I still know only very little about curl, however, that
train of thought might be entirely off tracks. ;-)


---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-12 Thread Daniel Stenberg via curl-library

On Sun, 12 Aug 2018, myLC--- via curl-library wrote:

>> However, I wouldn't be opposed to somehow exposing the whole list if someone
>> would like to work on it.
>
> If this happened through an optional callback after the hostname has been
> resolved, you wouldn't have to "expose internals", nor would you have to
> change the code much.


We wouldn't expose internals, either way. If we would provide the data in an
array or struct somehow, that would be made specifically for exporting
purposes. There would be no difference between providing the data in a callback
or post-transfer in a curl_easy_getinfo() call.


I think most existing users would presume such data to become available via 
curl_easy_getinfo(), like the primary IP and friends already are.


--

 / daniel.haxx.se
---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-12 Thread myLC--- via curl-library

On 08/11/2018 01:47 PM, Daniel Stenberg wrote:

> However, I wouldn't be opposed to somehow exposing the
> whole list if someone would like to work on it.


If this happened through an optional callback after the
hostname has been resolved, you wouldn't have to "expose
internals", nor would you have to change the code much.
I haven't seen how this is being handled internally.
Is there something similar to the addrinfo struct?
If this is handled "in a portable fashion" internally (same
structure regardless of OS), then it boils down to
passing the pointer to that. :-)


---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-11 Thread Gisle Vanem via curl-library

myLC---wrote:


> I would like to know how we can retrieve all the IP
> addresses which are mapped to a host.
>
> Assuming we have the URL
> https://example.buzz/bingo_results/
>
> and assuming further that there are 5 addresses mapped to
> this hostname:
> 10.0.0.11, 10.0.0.12, 10.0.0.13,
> fd0e:34f4:760f:5bd6:0123:4567:89ab:cdef,
> fd0e:34f4:760f:5bd6::::
>
> How would I get them from libcurl?


Not sure you can get them all (unless one of them
fails). But the "primary IP" could be fetched by:
  char *ip = NULL;  /* points to libcurl-owned memory; do not free it */
  curl_easy_getinfo(curl, CURLINFO_PRIMARY_IP, &ip);

Or try my old 'sock_snoop.c' example attached.

c:\>sock_snoop.exe -v http://www.vg.no

* STATE: INIT => CONNECT handle 0x24bb028; line 1447 (connection #-5000)
* Rebuilt URL to: www.vg.no/
* Added connection 0. The cache now contains 1 members
* STATE: CONNECT => WAITRESOLVE handle 0x24bb028; line 1483 (connection #0)
AF_INET: 195.88.54.16

*   Trying 195.88.54.16...
* Could not set TCP_NODELAY: Descriptor is not a socket
* Immediate connect fail for 195.88.54.16: Descriptor is not a socket
AF_INET: 195.88.55.16

*   Trying 195.88.55.16...
* Could not set TCP_NODELAY: Descriptor is not a socket
* Immediate connect fail for 195.88.55.16: Descriptor is not a socket
AF_INET6: 2001:67c:21e0::16

*   Trying 2001:67c:21e0::16...
* Could not set TCP_NODELAY: Descriptor is not a socket
* Immediate connect fail for 2001:67c:21e0::16: Descriptor is not a socket
* Closing connection 0
* The cache now contains 0 members
* Expire cleared



--
--gv


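#include <stdio.h>
#include <string.h>
#include <curl/curl.h>

#ifdef _WIN32
#include <ws2tcpip.h>   /* inet_ntop() */
#else
#include <arpa/inet.h>  /* inet_ntop() */
#include <sys/socket.h>
#include <netinet/in.h>
#endif

/* When non-zero, only print the addresses and never open a real socket. */
static int snoop = 1;

/* CURLOPT_OPENSOCKETFUNCTION callback: libcurl calls it for each address
   it is about to try, so every attempted address gets printed. */
static curl_socket_t getsock (void *clientp,
                              curlsocktype purpose,
                              struct curl_sockaddr *ca)
{
  const struct sockaddr_in  *a4 = (const struct sockaddr_in*) &ca->addr;
  const struct sockaddr_in6 *a6 = (const struct sockaddr_in6*) &ca->addr;
  char  buf [200];

  (void) clientp;
  (void) purpose;

  switch (ca->family)
  {
    case AF_INET:
         printf ("AF_INET:  %s\n",
                 inet_ntop(ca->family, &a4->sin_addr, buf, sizeof(buf)));
         break;
    case AF_INET6:
         printf ("AF_INET6: %s\n",
                 inet_ntop(ca->family, &a6->sin6_addr, buf, sizeof(buf)));
         break;
  }

  /* Returning 0 instead of a real socket makes each connect attempt fail,
     so libcurl moves on to the next address in its list. */
  if (snoop)
     return 0;
  return socket (ca->family, ca->socktype, ca->protocol);
}

int main (int argc, char **argv)
{
  CURL *curl = NULL;
  char  scheme [200+1];
  char  url [200+1];
  int   verbose = 0;

  if (argc > 1 && !strcmp(argv[1],"-v"))
  {
    verbose = 1;
    argc--;
    argv++;
  }

  /* Split "scheme://rest"; only the part after "://" is given to libcurl. */
  if (argc < 2 || sscanf(argv[1],"%20[^/:]://%200s",scheme,url) != 2)
  {
    printf ("Usage: sock_snoop [-v] <url>\n");
    return (-1);
  }

  curl = curl_easy_init();
  if (!curl)
     return (-1);

  curl_easy_setopt (curl, CURLOPT_CONNECT_ONLY, 1L);
  curl_easy_setopt (curl, CURLOPT_URL, url);
  curl_easy_setopt (curl, CURLOPT_OPENSOCKETFUNCTION, getsock);
  curl_easy_setopt (curl, CURLOPT_VERBOSE, (long)verbose);
  curl_easy_perform (curl);
  curl_easy_cleanup (curl);
  return (0);
}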
---

Re: Retrieve all addresses mapped to specific host, not just one IP

2018-08-11 Thread Daniel Stenberg via curl-library

On Sat, 11 Aug 2018, myLC--- via curl-library wrote:

> I would like to know how we can retrieve all the IP addresses which are
> mapped to a host.


That list is not provided by libcurl. It will get that list internally and use 
that to connect to the host but the only address it offers through the API is 
the single IP it ended up using.


However, I wouldn't be opposed to somehow exposing the whole list if someone
would like to work on it.


> The "inbound" network implementation (TR) of C++ (Boost ASIO) returns an
> iterator upon having resolved a hostname. If I'm correct, the implementation
> chooses the IP randomly when connecting.


A "correct" client would use getaddrinfo() to get the list and that function
is supposed to return the addresses in the "preferred" order so it's not
exactly random even if it may appear so.


> Likewise, it tries the entire list of IPs before giving up. I haven't
> checked out libcurl's source code, but I'm guessing that you will do
> something similar.


libcurl goes through the list one by one until one works, yes. But it also 
does both IPv4 and IPv6 in parallel (so called happy eyeballs) and picks the 
one that connects first.


--

 / daniel.haxx.se
---