Re: Retrieve all addresses mapped to specific host, not just one IP
On Wed, 15 Aug 2018, myLC--- via curl-library wrote:

> Just upfront: you make it sound like my request for being
> able to get all associated IPs is a bizarre idea. I feel
> the need to state that many libraries have this implemented
> naturally for a good reason (a: it doesn't cost anything +
> b: in some scenarios, you need it).

I don't think it is a bizarre idea but I'm very much in favour of having features well motivated and that the reasoning and use cases are better understood. It helps avoid misunderstandings and implementations that end up not fulfilling them. Also, libcurl has provided internet transfer powers to applications for nearly two decades by now and all those existing users have managed without this feature so I think a little scepticism is natural.

So thanks for enduring the questions!

-- 
 / daniel.haxx.se
---
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette: https://curl.haxx.se/mail/etiquette.html
Re: Retrieve all addresses mapped to specific host, not just one IP
Just upfront: you make it sound like my request for being able to get all associated IPs is a bizarre idea. I feel the need to state that many libraries have this implemented naturally for a good reason (a: it doesn't cost anything + b: in some scenarios, you need it).

The upcoming C++ networking will have this:
"The class template basic_resolver_results satisfies the requirements of a sequence container ... The class template basic_resolver_results supports forward iterators."
Translation: You get the list of IPs.

Qt has this:
"QList<QHostAddress> QHostInfo::addresses() const
Returns the list of IP addresses associated with hostName()."

Many others have this as well. The C++ standard committee rarely implements anything that is not going to be used.

On Tue, 14 Aug 2018, Gisle Vanem wrote:
> "should favour" how? Based on what; that IPv6 is
> better/speedier than IPv4, or some addresses based on Geo-
> location is best? libcurl knows zero about this. It would
> be cool if it did though.

This should be handled by the layers below libcurl. The function(s) used by libcurl can or should return the list with prioritization in mind. A short discussion about this can be found here:
https://stackoverflow.com/questions/11241339/will-getaddrinfo-return-ipv6-addresses-first

On Tue, 14 Aug 2018, Richard Gray wrote:
> OK, filtering - got it. That's not a small area, btw. ;-)
> You still haven't indicated what kind of access(es) you
> are trying to perform with the potentially multiple
> addresses. Are you trying to just find the first one that
> works? Are you trying to actually access more than one of
> them? The latter case might be something like testing the
> various hosts behind a load leveler.

Your application might simply choose to exclude some for particular reasons, while preferring others. Or it might simply want to document the mappings. There are many applications for this...

> No, you would do the resolves then tell libcurl to operate
> on any returned address(es) you are interested in.

Yes, this would essentially result in having the same code twice and having to maintain that code. Most/many other libraries didn't go for that approach. They simply hand you the list, which they have anyhow.
Re: Retrieve all addresses mapped to specific host, not just one IP
myLC--- via curl-library wrote:
> On Mon, 13 Aug 2018, Richard Gray wrote:
>> I'm confused about what it is you are trying to do with
>> the list of addresses? ...
>
> If you have a list of IPs/hostnames, this is necessary to
> identify duplicate entries (public VPNs or proxies, for
> instance).

OK, filtering - got it.

>> It's not clear to me why you are trying to get libcurl to
>> return that address list. ...

You still haven't indicated what kind of access(es) you are trying to perform with the potentially multiple addresses. Are you trying to just find the first one that works? Are you trying to actually access more than one of them? The latter case might be something like testing the various hosts behind a load leveler.

>> If you are on a modern system, you already have a way to
>> do this: getaddrinfo() or equivalent.
>
> That would imply doing it twice - libcurl would do it once
> and then you'd do the same afterwards.

No, you would do the resolves then tell libcurl to operate on any returned address(es) you are interested in. If the host(s) don't have a problem with an access using a literal IP for the URL, just format URLs with literal IPs instead of host names. If the host(s) need the actual host name you resolved against for TLS or other purposes, will CURLOPT_CONNECT_TO not do what you want by telling libcurl not to resolve the host name but instead use the IP you supply?

With either of these options, the host is not redundantly resolved and you are in complete control of what IPs are ignored/accessed, if they are accessed sequentially or in parallel, etc. I think this might be what you are after. I don't see how an extension to get the full address list would help because you'd still have to do something like the above for the rest of any addresses you were interested in anyway.

Cheers!
Rich
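The CURLOPT_CONNECT_TO route suggested above takes entries of the form HOST:PORT:CONNECT-TO-HOST:CONNECT-TO-PORT, with IPv6 literals in brackets. A minimal sketch of building one such entry for an address the application resolved itself follows; format_connect_to() is a hypothetical helper name, not a libcurl function, and it assumes the same port on both sides and a host given by name.

```c
#include <stdio.h>
#include <string.h>

/* Build one CURLOPT_CONNECT_TO entry: HOST:PORT:CONNECT-TO-HOST:CONNECT-TO-PORT.
 * `host` is assumed to be a host name (not an IP literal) and `ip` the literal
 * address the application picked. Hypothetical helper, not part of libcurl.
 * Returns 0 on success, -1 if `buf` is too small. */
static int format_connect_to(char *buf, size_t size, const char *host,
                             int port, const char *ip)
{
    int n;

    if (strchr(ip, ':') != NULL)   /* IPv6 literal: libcurl wants brackets */
        n = snprintf(buf, size, "%s:%d:[%s]:%d", host, port, ip, port);
    else
        n = snprintf(buf, size, "%s:%d:%s:%d", host, port, ip, port);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Intended use with libcurl (shown as a comment, not compiled here):
 *
 *   struct curl_slist *to = curl_slist_append(NULL, buf);
 *   curl_easy_setopt(curl, CURLOPT_URL, "https://example.buzz/bingo_results/");
 *   curl_easy_setopt(curl, CURLOPT_CONNECT_TO, to);
 *   curl_easy_perform(curl);
 *   curl_slist_free_all(to);
 */
```

Because the URL keeps the original host name, TLS verification and the Host: header still see that name while the connection goes to the chosen address.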
Re: Retrieve all addresses mapped to specific host, not just one IP
myLC--- wrote:

> Prioritization (which IPs libcurl should favor) might
> become an issue then.

"should favour" how? Based on what; that IPv6 is better/speedier than IPv4, or that some addresses based on Geo-location are best? libcurl knows zero about this. It would be cool if it did though.

In fact IPv6 can be a lot slower than IPv4. My case right now (with IPv6 over a '6to4' tunnel) is that a:
  curl -6 server-in-Oslo-Norway

goes via California! (3 times slower than with 'curl -4'). So much for this hyped-up IPv6 protocol.

-- 
--gv
Re: Retrieve all addresses mapped to specific host, not just one IP
On Mon, 13 Aug 2018, Richard Gray wrote:
> I'm confused about what it is you are trying to do with
> the list of addresses? ...

If you have a list of IPs/hostnames, this is necessary to identify duplicate entries (public VPNs or proxies, for instance).

> It's not clear to me why you are trying to get libcurl to
> return that address list. ...
> If you are on a modern system, you already have a way to
> do this: getaddrinfo() or equivalent.

That would imply doing it twice - libcurl would do it once and then you'd do the same afterwards.

> I guess I'm wondering if it makes more sense for your
> application to get the list of addresses itself and then
> tell libcurl what to do with them.

Yes, but libcurl already does the same. Of course, you can do it yourself and hand the list of IPs to libcurl. Prioritization (which IPs libcurl should favor) might become an issue then. This way, you'd end up having to mimic just about everything libcurl normally does. Furthermore, you'd have to devise a way to hand over that information in a proper form. This would be essentially the same in reverse. It would result in having the same code twice in your (static) binaries, though.
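The duplicate-detection use case mentioned above comes down to comparing addresses in binary form, since the same IPv6 address can be written in several ways. A minimal sketch, assuming POSIX inet_pton(); parse_ip() and same_address() are hypothetical helper names, and IPv4-mapped IPv6 addresses are deliberately not treated as equal to their IPv4 form.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Parse a textual IPv4 or IPv6 address into a zero-padded 16-byte buffer.
 * Returns the address family, or 0 if `text` is not an IP literal. */
static int parse_ip(const char *text, unsigned char bin[16])
{
    memset(bin, 0, 16);
    if (inet_pton(AF_INET, text, bin) == 1)
        return AF_INET;
    if (inet_pton(AF_INET6, text, bin) == 1)
        return AF_INET6;
    return 0;
}

/* Return 1 if both strings denote the same IP address, 0 otherwise
 * (including when either string is not an IP literal). */
static int same_address(const char *a, const char *b)
{
    unsigned char x[16], y[16];
    int fa = parse_ip(a, x);
    int fb = parse_ip(b, y);

    return fa != 0 && fa == fb && memcmp(x, y, sizeof(x)) == 0;
}
```

Two host names would then count as duplicates when any of their resolved addresses compare equal this way.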
Re: Retrieve all addresses mapped to specific host, not just one IP
https://curl.haxx.se//mail/lib-2018-08/0137.html

myLC--- via curl-library wrote:
> Hello, :-)
>
> I'm new to (lib)curl. I decided that this was the most
> complete network implementation, after ditching Qt for
> continuous multithreading issues.
>
> I would like to know how we can retrieve all the IP
> addresses which are mapped to a host.
>
> Assuming we have the URL
> https://example.buzz/bingo_results/
> and assuming further that there are 5 addresses mapped to
> this hostname:
> 10.0.0.11,
> 10.0.0.12,
> 10.0.0.13,
> fd0e:34f4:760f:5bd6:0123:4567:89ab:cdef,
> fd0e:34f4:760f:5bd6::::
>
> How would I get them from libcurl?
>
> The "inbound" network implementation (TR) of C++ (Boost
> ASIO) returns an iterator upon having resolved a hostname.
> If I'm correct, the implementation chooses the IP randomly
> when connecting. Likewise, it tries the entire list of IPs
> before giving up. I haven't checked out libcurl's source
> code, but I'm guessing that you will do something similar.
>
> Virtually all the functions for resolving hostnames return
> all the addresses (getaddrinfo, for instance). Therefore,
> at some point, libcurl must have this information.
>
> So, how do I get that list of all addresses? I'm inquiring,
> because I would hate to employ a second engine/dispatcher
> for this.

I'm confused about what it is you are trying to do with the list of addresses?
- connect to each one and perform an operation?
- try to connect (simultaneously?) to several and find the first one that works?

Are you asking libcurl to do a normal operation on a given host URL with host name and oh, BTW, return whatever address list libcurl resolved? This seems like a highly specialized feature to add to libcurl.

It's not clear to me why you are trying to get libcurl to return that address list. If you are on a modern system, you already have a way to do this: getaddrinfo() or equivalent. I guess I'm wondering if it makes more sense for your application to get the list of addresses itself and then tell libcurl what to do with them.

Is it good enough to create URLs with each IP literal? Perhaps what you are looking for is a way to perform the same operation on a given URL but with different IP addresses? (All hosts see the same textual host name?) Possibly simultaneously via multi?

Or have I totally missed something? To have discussions go right into API details suggests that this is of more value than I thought, even though I don't really understand why.

Cheers!
Rich
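For reference, the "getaddrinfo() or equivalent" route mentioned above can be sketched in a few lines of POSIX C. list_addresses() and ADDR_TEXT_LEN are hypothetical names chosen for this illustration; the order of the results is whatever the system resolver returns.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose getaddrinfo() under strict -std= modes */
#endif
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

#define ADDR_TEXT_LEN 64   /* room for a textual IPv6 address plus scope id */

/* Resolve `host` and store up to `max` textual addresses in `out`, in the
 * order the resolver returned them. Returns the number stored, or -1 if the
 * lookup fails. Hypothetical helper, not part of libcurl. */
static int list_addresses(const char *host, char out[][ADDR_TEXT_LEN], int max)
{
    struct addrinfo hints, *res, *ai;
    int n = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;   /* one entry per address, not per socket type */

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL && n < max; ai = ai->ai_next) {
        /* NI_NUMERICHOST converts the binary address to text without DNS */
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen, out[n], ADDR_TEXT_LEN,
                        NULL, 0, NI_NUMERICHOST) == 0)
            n++;
    }
    freeaddrinfo(res);
    return n;
}
```

Called with the host name from the URL, this yields the candidate list an application could filter before telling libcurl which address to use.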
Re: Retrieve all addresses mapped to specific host, not just one IP
On Mon, 13 Aug 2018, myLC--- via curl-library wrote:

> With a callback you'd make sure that the user takes care of
> it when the data becomes available in that format - and
> only then.

Yes, but since we need to copy the entire data to the export struct, keeping that memory around a little longer is not much harder. But of course, if done for a lot of connections with a lot of addresses, doing the callback approach saves notable amounts of memory. Plus the fact that if there's no callback, there's no need to do any allocs/copies, which of course is a big plus.

A callback approach might also get away with less copying and more "pointing".

> The addrinfo structure is well documented (spares you the
> hassle of writing it up yourself;-).

1. we support platforms without it

2. we can't assume that users will find the struct somewhere else (and that it will be sufficiently documented where found) so we still need to document it if we use it in our API

> If you name the constant/callback after this structure, you
> can have a new one if the internal structure changes.

That could easily get annoying and require API changes too often. Separating external and internal structs is just sensible and more likely to keep us sane in the future.

-- 
 / daniel.haxx.se
Re: Retrieve all addresses mapped to specific host, not just one IP
On 08/13/2018 12:24 PM, Daniel Stenberg wrote:
> ...
> Sure, I'm not totally against a callback. But we'd still have to convert
> the representation to something that we think we can stick to for the
> foreseeable future even if the internals would change...

I do not really care what the structure looks like. I simply predict that the entire caboose will be converted twice in most cases.

With a callback you'd make sure that the user takes care of it when the data becomes available in that format - and only then. The addrinfo structure is well documented (spares you the hassle of writing it up yourself;-). If you name the constant/callback after this structure, you can have a new one if the internal structure changes. Then the old addrinfo export would do the copying and be marked as deprecated (you'd then delete the created structures after the callback returns).

I simply look at it from a performance angle. Who needs all addresses? This is the exception. It's likely that programs dealing with lots of entries will use this. They will probably use their own format as they will store other information, and they might not need all the data/addresses from your structures.

What you suggest surely sounds 'cleaner'. Nonetheless, the 'copy and convert the whole list twice for every lookup' is almost guaranteed. That is, your conversion will be discarded immediately afterwards, just to become something new entirely.

I guess it depends on personal preference to determine which of those solutions hurts more. :-)
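To make the callback idea concrete, here is a minimal sketch of the pattern outside libcurl: the resolver's own list is walked once and each entry is handed to a user callback by pointer, so nothing is copied unless the callback decides to. addr_callback, resolve_with_callback() and count_ipv4() are hypothetical names for this illustration only; libcurl offers no such hook in this thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose getaddrinfo() under strict -std= modes */
#endif
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical callback type: invoked once per resolved address.
 * `addr` is only valid during the call. Return non-zero to stop early. */
typedef int (*addr_callback)(const struct sockaddr *addr, socklen_t len,
                             void *userp);

/* Resolve `host` and pass every address to `cb` without copying it.
 * Returns the number of addresses visited, or -1 if the lookup fails. */
static int resolve_with_callback(const char *host, addr_callback cb,
                                 void *userp)
{
    struct addrinfo hints, *res, *ai;
    int seen = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        seen++;
        /* With no callback set, nothing is converted or copied at all. */
        if (cb != NULL && cb(ai->ai_addr, ai->ai_addrlen, userp) != 0)
            break;
    }
    freeaddrinfo(res);   /* pointers handed to cb are invalid from here on */
    return seen;
}

/* Example callback: counts IPv4 entries in an int supplied via userp. */
static int count_ipv4(const struct sockaddr *addr, socklen_t len, void *userp)
{
    (void)len;
    if (addr->sa_family == AF_INET)
        ++*(int *)userp;
    return 0;
}
```

The trade-off discussed in the thread shows up in the freeaddrinfo() line: the callback sees the data for free, but must copy whatever it wants to keep.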
Re: Retrieve all addresses mapped to specific host, not just one IP
On Mon, 13 Aug 2018, myLC--- via curl-library wrote:

>> I think most existing users would presume such data to
>> become available via curl_easy_getinfo(), like the primary
>> IP and friends already are.
>
> I'm obviously not that acquainted with curl. I took a glance
> at the source. Unless I'm mistaken, you are using a renamed
> addrinfo struct on the inside.

Yes, although not just renamed, it's an actual private version of the entire struct.

> Could it lead to problems with multiple threads, if you
> simply passed a pointer to that (chain of) struct(s) via
> curl_easy_getinfo?

We never export internal data like that. Export data needs to get their own struct if a struct is to be used so that we don't mix internal representations with what we promise in external APIs/ABIs.

But more so: there's no guarantee that we have the name resolved data left around after a transfer is complete as it is only saved in the DNS cache for a certain (customizable) time. We would probably need to convert the internal data to the exportable representation at the lookup time.

> By using the callback after having resolved the hostname,
> you'd dispose of this burden.

Sure, I'm not totally against a callback. But we'd still have to convert the representation to something that we think we can stick to for the foreseeable future even if the internals would change...

-- 
 / daniel.haxx.se
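The "convert at lookup time into a separate export representation" idea can be illustrated with a small deep-copy routine. This is only a sketch under assumptions: struct exported_addr, export_addresses() and free_exported() are hypothetical names, and the layout is not anything libcurl promises. The point is that the copy stays valid after the resolver's own data has been freed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose getnameinfo() under strict -std= modes */
#endif
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical export record, independent of the internal representation. */
struct exported_addr {
    int family;                  /* AF_INET or AF_INET6 */
    char text[64];               /* numeric address as text */
    struct exported_addr *next;
};

/* Free a list returned by export_addresses(). */
static void free_exported(struct exported_addr *list)
{
    while (list != NULL) {
        struct exported_addr *next = list->next;
        free(list);
        list = next;
    }
}

/* Deep-copy an addrinfo chain into export records, preserving order.
 * Returns NULL for an empty chain or on allocation failure. */
static struct exported_addr *export_addresses(const struct addrinfo *ai)
{
    struct exported_addr *head = NULL, **tail = &head;

    for (; ai != NULL; ai = ai->ai_next) {
        struct exported_addr *e = calloc(1, sizeof(*e));
        if (e == NULL) {
            free_exported(head);
            return NULL;
        }
        e->family = ai->ai_family;
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen, e->text, sizeof(e->text),
                        NULL, 0, NI_NUMERICHOST) != 0) {
            free(e);             /* skip entries that cannot be rendered */
            continue;
        }
        *tail = e;
        tail = &e->next;
    }
    return head;
}
```

An application that converts again into its own format would indeed touch every entry twice, which is the cost weighed against a stable external struct in this exchange.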
Re: Retrieve all addresses mapped to specific host, not just one IP
myLC--- wrote:

> at the source. Unless I'm mistaken, you are using a renamed
> addrinfo struct on the inside. Could it lead to problems
> with multiple threads, if you simply passed a pointer to
> that (chain of) struct(s) via curl_easy_getinfo?

It's copied to an internal structure inside 'singleipconnect()' AFAICS.

-- 
--gv
Re: Retrieve all addresses mapped to specific host, not just one IP
On 08/12/2018 06:55 PM, Daniel Stenberg wrote:
> ...
> We wouldn't expose internals, either way. If we would
> provide the data in an array or struct somehow, that would
> be made specifically for exporting purposes. That would be
> no difference between providing the data in a callback or
> post-transfer in a curl_easy_getinfo() call.
>
> I think most existing users would presume such data to
> become available via curl_easy_getinfo(), like the primary
> IP and friends already are.

I'm obviously not that acquainted with curl. I took a glance at the source. Unless I'm mistaken, you are using a renamed addrinfo struct on the inside. Could it lead to problems with multiple threads, if you simply passed a pointer to that (chain of) struct(s) via curl_easy_getinfo?

If so, you'd have to copy it (or convert it to something else). Users then probably copy the addresses to an entirely different format anyhow. By using the callback after having resolved the hostname, you'd dispose of this burden.

That was my thinking. Given that I still know only very little about curl, however, that train of thought might be entirely off track. ;-)
Re: Retrieve all addresses mapped to specific host, not just one IP
On Sun, 12 Aug 2018, myLC--- via curl-library wrote:

>> However, I wouldn't be opposed to somehow exposing the whole
>> list if someone would like to work on it.
>
> If this happened through an optional callback after the
> hostname has been resolved, you wouldn't have to "expose
> internals", nor would you have to change the code much.

We wouldn't expose internals, either way. If we would provide the data in an array or struct somehow, that would be made specifically for exporting purposes. There would be no difference between providing the data in a callback or post-transfer in a curl_easy_getinfo() call.

I think most existing users would presume such data to become available via curl_easy_getinfo(), like the primary IP and friends already are.

-- 
 / daniel.haxx.se
Re: Retrieve all addresses mapped to specific host, not just one IP
On 08/11/2018 01:47 PM, Daniel Stenberg wrote:
> However, I wouldn't be opposed somehow exposing the whole
> list if someone would like to work on it.

If this happened through an optional callback after the hostname has been resolved, you wouldn't have to "expose internals", nor would you have to change the code much.

I haven't seen how this is being handled internally. Is there something similar to the addrinfo struct? If this is handled "in a portable fashion" internally (same structure regardless of OS), then it boils down to passing the pointer to that. :-)
Re: Retrieve all addresses mapped to specific host, not just one IP
myLC--- wrote:

> I would like to know how we can retrieve all the IP
> addresses which are mapped to a host.
>
> Assuming we have the URL
> https://example.buzz/bingo_results/
> and assuming further that there are 5 addresses mapped to
> this hostname:
> 10.0.0.11,
> 10.0.0.12,
> 10.0.0.13,
> fd0e:34f4:760f:5bd6:0123:4567:89ab:cdef,
> fd0e:34f4:760f:5bd6::::
>
> How would I get them from libcurl?

Not sure you can get them all (unless one of them fails). But the "primary IP" could be fetched by:

  char *ip = NULL;  /* points into the handle; do not free it */
  curl_easy_getinfo(curl, CURLINFO_PRIMARY_IP, &ip);

Or try my old 'sock_snoop.c' example attached.

  c:\>sock_snoop.exe -v http://www.vg.no
  * STATE: INIT => CONNECT handle 0x24bb028; line 1447 (connection #-5000)
  * Rebuilt URL to: www.vg.no/
  * Added connection 0. The cache now contains 1 members
  * STATE: CONNECT => WAITRESOLVE handle 0x24bb028; line 1483 (connection #0)
  AF_INET: 195.88.54.16
  *   Trying 195.88.54.16...
  * Could not set TCP_NODELAY: Descriptor is not a socket
  * Immediate connect fail for 195.88.54.16: Descriptor is not a socket
  AF_INET: 195.88.55.16
  *   Trying 195.88.55.16...
  * Could not set TCP_NODELAY: Descriptor is not a socket
  * Immediate connect fail for 195.88.55.16: Descriptor is not a socket
  AF_INET6: 2001:67c:21e0::16
  *   Trying 2001:67c:21e0::16...
  * Could not set TCP_NODELAY: Descriptor is not a socket
  * Immediate connect fail for 2001:67c:21e0::16: Descriptor is not a socket
  * Closing connection 0
  * The cache now contains 0 members
  * Expire cleared

-- 
--gv

  #include <stdio.h>
  #include <string.h>
  #include <curl/curl.h>
  #ifdef _WIN32
  #include <ws2tcpip.h>     /* inet_ntop() */
  #else
  #include <arpa/inet.h>    /* inet_ntop(), struct sockaddr_in/in6 */
  #include <sys/socket.h>
  #endif

  static int snoop = 1;

  /* Called by libcurl for every address it is about to try. */
  static curl_socket_t getsock (void *clientp, curlsocktype purpose,
                                struct curl_sockaddr *ca)
  {
    const struct sockaddr_in  *a4 = (const struct sockaddr_in*) &ca->addr;
    const struct sockaddr_in6 *a6 = (const struct sockaddr_in6*) &ca->addr;
    char buf [200];

    (void) clientp;
    (void) purpose;

    switch (ca->family) {
      case AF_INET:
           printf ("AF_INET: %s\n",
                   inet_ntop(ca->family, &a4->sin_addr, buf, sizeof(buf)));
           break;
      case AF_INET6:
           printf ("AF_INET6: %s\n",
                   inet_ntop(ca->family, &a6->sin6_addr, buf, sizeof(buf)));
           break;
    }

    /* When snooping, hand back a descriptor that is not a socket, so the
     * connect fails and libcurl moves on to the next address. */
    if (snoop)
       return 0;
    return socket (ca->family, ca->socktype, ca->protocol);
  }

  int main (int argc, char **argv)
  {
    CURL *curl = NULL;
    char  scheme [200+1];
    char  url [200+1];
    int   verbose = 0;

    if (argc > 1 && !strcmp(argv[1],"-v")) {
      verbose = 1;
      argc--;
      argv++;
    }

    /* Split "scheme://rest"; both conversions are bounded by the buffers. */
    if (argc < 2 || sscanf(argv[1],"%20[^/:]://%200s",scheme,url) != 2) {
      printf ("Usage: sock_snoop [-v] \n");
      return (-1);
    }

    curl = curl_easy_init();
    if (!curl)
       return (-1);

    curl_easy_setopt (curl, CURLOPT_CONNECT_ONLY, 1L);
    curl_easy_setopt (curl, CURLOPT_URL, url);
    curl_easy_setopt (curl, CURLOPT_OPENSOCKETFUNCTION, getsock);
    curl_easy_setopt (curl, CURLOPT_VERBOSE, (long)verbose);
    curl_easy_perform (curl);
    curl_easy_cleanup (curl);
    return (0);
  }
Re: Retrieve all addresses mapped to specific host, not just one IP
On Sat, 11 Aug 2018, myLC--- via curl-library wrote:

> I would like to know how we can retrieve all the IP
> addresses which are mapped to a host.

That list is not provided by libcurl. It will get that list internally and use that to connect to the host but the only address it offers through the API is the single IP it ended up using.

However, I wouldn't be opposed to somehow exposing the whole list if someone would like to work on it.

> The "inbound" network implementation (TR) of C++ (Boost
> ASIO) returns an iterator upon having resolved a hostname.
> If I'm correct, the implementation chooses the IP randomly
> when connecting.

A "correct" client would use getaddrinfo() to get the list and that function is supposed to return the addresses in the "preferred" order so it's not exactly random even if it may appear so.

> Likewise, it tries the entire list of IPs before giving up.
> I haven't checked out libcurl's source code, but I'm
> guessing that you will do something similar.

libcurl goes through the list one by one until one works, yes. But it also does both IPv4 and IPv6 in parallel (so called happy eyeballs) and picks the one that connects first.

-- 
 / daniel.haxx.se