Re: glibc's getaddrinfo() sort order
On Mon, Sep 24, 2007 at 11:18:00AM +0100, Ian Jackson wrote:
> COMMON BEHAVIOUR ON TODAY'S INTERNET IS THAT IMPLEMENTED BY GETHOSTBYNAME.

Common behavior for gethostbyname() on today's Internet is that implemented commonly in gethostbyname().

> How many times do I have to explain this ? getaddrinfo is the REPLACEMENT FOR GETHOSTBYNAME. It is not an interface which applications choose because they want different address sorting behaviour. It is the interface applications MUST USE TO SUPPORT IPV6.

I don't think that this is true. getipnodebyname() is an interface applications can use and is much more conducive to drop-in replacement given its interface. I am not recommending use of this function, but your leap of logic is strange.

> Changing applications to use getaddrinfo instead of gethostbyname is done BECAUSE THOSE APPLICATIONS ARE BEING UPDATED TO SUPPORT IPV6.

I think it's done because it's a better, more standardized interface.

> Updating an application to support IPv6 should not change the way it treats DNS RRsets containing multiple IPv4 addresses.

Obviously. Anything assuming what you assume about DNS resolution is going to break in the future eventually.
Re: glibc's getaddrinfo() sort order
[In all my comments below, I am assuming that we are focused on rule 9 as pertains to sorting of IPv4 addresses. A strict sorting of IPv6 addresses by length of prefix match is also questionable, but not so much so that I believe overruling is justified.] On Fri, Sep 21, 2007 at 01:07:49PM +1000, Anthony Towns wrote: On Thu, Sep 20, 2007 at 06:19:10PM -0700, Steve Langasek wrote: So do you have a use case where you think the behavior described in rule 9 *is* desirable? Any application written assuming this behaviour, works correctly on Windows, Solaris, *BSD and glibc based systems in general, but not on Debian. So my argument here is that I don't believe there *are* any applications being written that assume this behavior; and that even if there were, such applications would either work just fine with the previous getaddrinfo() behavior, or be too pathological to live. The goal of RFC3484 is to describe how system resolvers should sort addresses in order to give applications the best address first. I think it's already established in this thread (correct me if I'm wrong) that this specific rule of sorting by the length of the prefix match does *not* further the goal of sorting addresses from most to least desirable. Instead, taken over the whole Internet rule 9 is statistically a pseudo-randomization relative to the *correct* sorting[1], but with features that make it an outright pessimization for particular real-world hostnames due to the set of addresses returned. I don't believe any sane application could depend on such pessimization. One of the existing use cases that breaks is round-robin DNS. Round-robin DNS is not an IETF standard; its use has been discouraged by various parties for years; it has limitations that make it unsuitable for any but the simplest of configurations. But none of these are reasons to willfully degrade currently working setups. 
They might be reasons why RR DNS would be an acceptable sacrifice in favor of other beneficial features, but rule 9 as written offers *no* benefits in the general case! Another possibility is a DNS server that intelligently sorts records based on knowledge of the client, but returns all the addresses instead of truncating the list. Arguably it would be more intelligent if the DNS server didn't return multiple records in this case, but again rule 9 is not an improvement, so why should it be honored? In the bug log, Pierre reported this behaviour is already supported on most of those sytems: ] On that matter, according to Aurelien, Vista (maybe XP), ] {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (MacOS X ] and solaris come to mind). So it's kind of a decision of Debian vs. the ] rest of the world. And if I don't really care about the issue of the ] decision technically, this aspect worries me. The purpose of following standards is to foster interoperation between products of multiple vendors. Conforming with Section 6, rule 9 of RFC 3484 does *not* improve interoperation with other vendors. Instead, it breaks interoperation with existing Internet infrastructure. Ignoring rule 9 has no foreseeable negative consequences for Debian client systems. Even if this were an IETF standard and all the rest of the Internet implemented it, the only consequence of Debian ignoring rule 9 would be that Debian systems would continue to play better than everyone else with those sites that were in the process of transitioning away from round-robin DNS. So I don't see that much weight should be given to whether other operating system vendors choose to comply with a rule which is, fundamentally, misguided and broken. Furthermore, even if gethostbyname() has been deprecated in POSIX, it's relevant that there is still plenty of software in Debian that uses this interface[1]. 
Almost all of this software is going to be IPv4-only; if we want Debian to be fully IPv6-capable, these are programs that will need to be updated to use the getaddrinfo() interface, at which point they will cease to work correctly with round-robin DNS in the absence of additional code to re-randomize addresses(!). The more work that is needed to make an IPv4 application function correctly with both IPv4 and IPv6, the less likely it is that this work will get done; some maintainers may opt not to enable IPv6 support in their packages rather than use an interface that degrades behavior under IPv4. (I wonder how many of the applications in Debian are IPv6-enabled as a result of local patches, or build options that other vendors aren't enabling? Debian has touted its IPv6 support since long before other vendors considered it relevant; perhaps the other vendors that do follow rule 9 in their getaddrinfo() implementations would take another look too if their IPv6 support was more pervasive?) Even if you do have one, I still don't see any reason to think this is a reasonable default behavior on the real-world Internet. As it happens I largely agree
Re: glibc's getaddrinfo() sort order
* Clint Adams:

> On Tue, Sep 18, 2007 at 08:41:45PM +0200, Kurt Roeckx wrote:
> > glibc is the only implementation I know of that does this.
>
> I have heard, though not confirmed first-hand, that modern versions of FreeBSD, Windows, and Solaris do as well.

FreeBSD 6.2-RELEASE doesn't do it. And neither does Fedora (with GNU libc 2.6.90-15, IPv6 not enabled). (Windows is an entirely different matter because the resolver model is completely different.)

You can run the following test program repeatedly to check if every A record gets its chance.

    import socket

    # Print the addresses in the order the resolver returns them.
    print(', '.join(x[4][0] for x in socket.getaddrinfo('pool.ntp.org', 123, 0, socket.SOCK_DGRAM)))
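To put numbers on "every A record gets its chance", the one-liner above can be wrapped in a loop that tallies which address comes back first. This is only a sketch: `first_address_counts` and `fake_rotating_resolver` are hypothetical names, and the fake resolver hands out documentation addresses (192.0.2.0/24) in rotation so that the example runs without a network.

```python
import socket
from collections import Counter
from itertools import count


def first_address_counts(host, port, tries=30, resolve=socket.getaddrinfo):
    """Call the resolver `tries` times and tally the first address of each answer."""
    tally = Counter()
    for _ in range(tries):
        answers = resolve(host, port, socket.AF_INET, socket.SOCK_DGRAM)
        tally[answers[0][4][0]] += 1
    return tally


def fake_rotating_resolver(addresses):
    """Hypothetical stand-in for a resolver that passes DNS rotation through."""
    calls = count()

    def resolve(host, port, family=0, socktype=0):
        # Rotate the list by one position on every call, like round-robin DNS.
        shift = next(calls) % len(addresses)
        rotated = addresses[shift:] + addresses[:shift]
        return [(socket.AF_INET, socktype, 17, "", (a, port)) for a in rotated]

    return resolve


pool = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
counts = first_address_counts("pool.example", 123, tries=30,
                              resolve=fake_rotating_resolver(pool))
print(dict(counts))
```

Leaving out the `resolve` argument uses the real `socket.getaddrinfo`. With a resolver that sorts its results, one address would be expected to take nearly every count, though local DNS caching can also influence what you see.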
Re: glibc's getaddrinfo() sort order
On Sun, Sep 23, 2007 at 04:21:58AM -0700, Steve Langasek wrote: On Fri, Sep 21, 2007 at 01:07:49PM +1000, Anthony Towns wrote: On Thu, Sep 20, 2007 at 06:19:10PM -0700, Steve Langasek wrote: So do you have a use case where you think the behavior described in rule 9 *is* desirable? Any application written assuming this behaviour, works correctly on Windows, Solaris, *BSD and glibc based systems in general, but not on Debian. So my argument here is that I don't believe there *are* any applications being written that assume this behavior; and that even if there were, such applications would either work just fine with the previous getaddrinfo() behavior, or be too pathological to live. There's two aspects to RFC3484's behaviour: first that it creates a much more stable ordering of its results than could have been expected otherwise, and second it tries to make that ordering more optimal than a random ordering would be wrt routing. Stability is useful for any case where the servers hosting a particular might be out of sync with each other; eg, if stability could be assumed we'd have less errors where an invocation of apt-get update chooses one mirror, and a subsequent apt-get upgrade chooses a different server that hasn't finished syncing. Hopefully apt-get isn't considered too pathological to live... Better routing has less direct benefits to the client, probably limited to slightly better ping times, with a small chance of somewhat cheaper bandwidth costs. For the people providing the service, it lets you make better assumptions as to load balancing -- you can expect the servers based in a particular area to be serving a load proportional to the number of users in that area, rather than having the load fairly evenly distributed globally. Of course, there are other ways of doing this that don't rely on how the client's resolver is implemented. Of course, if the routing is worse, those turn into drawbacks instead of benefits. 
Instead, taken over the whole Internet rule 9 is statistically a pseudo-randomization relative to the *correct* sorting[1], If that were the case it would be no worse than round-robin selection of preferred address. You can only take it over the whole Internet if you're assuming an equal distribution across all IPs, which isn't valid for IPv4 (where there's presumably a significant bias to private IPs), and presumably isn't valid for any particular service, which will be heavily biassed to particular IP ranges by correlation with location or language... One of the existing use cases that breaks is round-robin DNS. Round-robin DNS isn't broken; the expectation of (approximately) equal load-distribution across all servers in a round-robin is broken. They might be reasons why RR DNS would be an acceptable sacrifice in favor of other beneficial features, but rule 9 as written offers *no* benefits in the general case! Even without the possibility of applications like apt-get benefiting from stability of results, I don't think we've done anywhere enough of a review to be declaring that there aren't any benefits to rule 9. So I don't see that much weight should be given to whether other operating system vendors choose to comply with a rule which is, fundamentally, misguided and broken. As far as I can see, for rule 9 to be fundamentally misguided and broken, the concept of providing a stable answer, or a better than random ordering, would need to be harmful. If they're beneficial, even in some cases, then we've got a problem in the details of the specification, not a fundamental issue. 
(Note that prefix matching is the only reordering rule that has any effect in almost all actual cases, so without that rule or a replacement, both stability and any improvements in routing disappear) Note that stability isn't definitively a good thing -- if the first server you connect to happens to be the only one that's down/unreachable, then with a stable resolver you need to have specific failover code to use a different address; whereas if you can expect gethostbyname() to return a different first result, you can just rerun the program. Furthermore, even if gethostbyname() has been deprecated in POSIX, it's relevant that there is still plenty of software in Debian that uses this interface[1]. Almost all of this software is going to be IPv4-only; if we want Debian to be fully IPv6-capable, these are programs that will need to be updated to use the getaddrinfo() interface, at which point they will cease to work correctly with round-robin DNS in the absence of additional code to re-randomize addresses(!). Uh, round-robin DNS isn't a guarantee that any individual client will get different or randomised results -- and the argument that round-robin won't break anything that relies on rule 9 goes the other way too. Further, having getaddrinfo() behave differently for IPv4 and IPv6 isn't completely helpful in making Debian support IPv6 -- if we change a program
Re: glibc's getaddrinfo() sort order
* Anthony Towns:

> I don't agree with making a decision to go against an IETF standard

RFC 3484 is not an IETF standard.
Re: glibc's getaddrinfo() sort order
Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
> On Thu, Sep 20, 2007 at 06:19:10PM -0700, Steve Langasek wrote:
> > So do you have a use case where you think the behavior described in rule 9 *is* desirable?
>
> Any application written assuming this behaviour, works correctly on Windows, Solaris, *BSD and glibc based systems in general, but not on Debian.

You're completely missing the point. Applications are NOT written assuming this behaviour. Applications are written assuming the behaviour of gethostbyname and then later the call to gethostbyname is replaced by getaddrinfo when the application is upgraded to support IPv6.

Let us take a concrete example: http over tcp as implemented by curl. Originally, curl would call gethostbyname. gethostbyname would get its answers from the DNS, with the usual round robin. curl would then connect to the first address in the list. Across the population of calling clients, curl would (just as with other applications such as web browsers) pick one of the available addresses with roughly equal probability. So that is what the DNS administrator for the site intends by the publication of multiple address records.

Now, curl is changed to support IPv6. These are not very intrusive changes but one of them involves a pretty much direct replacement of gethostbyname with getaddrinfo. After being changed in this way curl will sort the addresses according to rule 9. This means for each client the address is always the same, and which one depends on the client's idea of its own address. Since clients are nowhere near uniformly distributed in the address space, this will direct the traffic quite non-uniformly. This is not what the DNS administrator had intended and represents a change to the behaviour.

So to recap, the three possibilities I mentioned were:

] (a) It is correct that the behaviour of applications (and hence of
]     hosts) should be changed to comply with rule 9.

Ie the DNS administrator was wrong, even if these DNS records were published before the change was made or before RFC3484 was written. I assume you're not proposing this.

] (b) Application behaviour should not change; getaddrinfo should
]     behave the same way as gethostbyname.

This seems obviously correct to me.

] (c) Application behaviour should not change but getaddrinfo should
]     comply with rule 9. Applications should therefore not be changed
]     to use getaddrinfo instead of gethostbyname.

And yours, which seems like a version of (c) to me:

] (d) Applications should use getaddrinfo(), and if the ordering behaviour
]     it uses is not desired, they should use an ordering that is desired.

Is the ordering behaviour desired ? Obviously not. So you seem to be suggesting that the direct replacement of gethostbyname with getaddrinfo is wrong in this case.

So how should curl be changed to use the desired (DNS round robin, or equivalent) ordering ? What is special about curl ? I could replace curl with almost any other application in the argument above and come to the same conclusions.

Ian.
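The effect described for curl can be sketched in a few lines. This is a simplified model of rule 9 only: it ignores the other RFC 3484 rules and assumes each client has a single source address. `common_prefix_len` and `rule9_order` are illustrative names, and all addresses come from documentation ranges rather than from any real site.

```python
import ipaddress
from collections import Counter


def common_prefix_len(a, b):
    """Number of leading bits two IPv4 addresses share."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def rule9_order(source, candidates):
    """Order candidates by longest common prefix with the source (stable sort)."""
    return sorted(candidates, key=lambda d: -common_prefix_len(source, d))


servers = ["198.51.100.7", "203.0.113.5"]
# Hypothetical client population, clustered near one of the servers.
clients = ["203.0.113.99", "203.0.113.120", "203.0.113.200", "192.0.2.10"]

# Whichever order the DNS rotation delivers, each client keeps the same first choice.
for c in clients:
    assert rule9_order(c, servers)[0] == rule9_order(c, servers[::-1])[0]

# Tally which server each client would connect to first.
load = Counter(rule9_order(c, servers)[0] for c in clients)
print(load)
```

Under these assumptions the split between servers follows where the clients sit in the address space, not the rotation published in the DNS.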
Re: glibc's getaddrinfo() sort order
Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
> As it happens I largely agree with that. I don't agree with making a decision to go against an IETF standard and glibc upstream lightly, though, no matter how many caps Ian expends repeating that it's at the least mature level of Internet standard.

Firstly: the STANDARD BEHAVIOUR FOR IPV4 IS THAT IMPLEMENTED BY GETHOSTBYNAME.

I wonder how familiar you are with Internet protocol standardisation, and the IETF ? The purely document-oriented and de jure approach you're taking doesn't seem to match actual Internet practice very well. Internet standards are living documents describing an evolving network. It is well known that if you read the RFCs as your only source of guidance for implementation you will go badly wrong. Making reference to an RFC which contradicts long-established existing behaviour is rather beside the point.

Secondly: RFC3484 mandates that all applications should change, even those using gethostbyname. (You have completely ignored this point.)

Thirdly: I'm not saying we should make this decision lightly. Saying "we shouldn't go against ... lightly" is just weasel-words. Is this discussion "[going] against ... lightly" ? No, of course not. What that argument would really be if you had any confidence in it would be "shouldn't go against ... at all" - but of course that's absurd.

I would like to expand on this point about standards. Slavish adherence to standards, or to the views of mistaken upstreams, is generally a mistake. This is particularly the case for the Debian Technical Committee. The TC's job is to decide what the correct behaviour is, by considering the technical merits. The TC's job is not to interpret standards documents. (Indeed, within our jurisdiction, our job includes changing them if we disagree with them.)

Obviously we need to use standards documents to help understand the behaviour of the actual computing systems, to understand what is expected of our systems and what responses other systems are likely to produce. If we find ourselves in clear disagreement with a standard we ought to ask ourselves whether we're sure we really understand the situation fully.

As the implementor of a DNS resolver library, a past IETF participant, a DNS administrator, and someone who's followed some of the IPv6 transition work, I'm convinced I have that understanding.

If you feel you don't have that understanding then please ask the questions which would help you gain it. I think we should be able to answer them.

Ian.
Re: glibc's getaddrinfo() sort order
On Wed, Sep 19, 2007 at 05:08:25AM +1000, Anthony Towns wrote:
> On Tue, Sep 18, 2007 at 07:18:40PM +0100, Ian Jackson wrote:
> > There are only three possibilities:
> > (a) It is correct that the behaviour of applications (and hence of hosts) should be changed to comply with rule 9.
> > (b) Application behaviour should not change; getaddrinfo should behave the same way as gethostbyname.
> > (c) Application behaviour should not change but getaddrinfo should comply with rule 9. Applications should therefore not be changed to use getaddrinfo instead of gethostbyname.
>
> No, there aren't. A fourth possibility is:
> (d) Applications should use getaddrinfo(), and if the ordering behaviour it uses is not desired, they should use an ordering that is desired.

The ordering rules affect all DNS queries, and the topography of the IPv4 Internet is such that we know this ordering is going to be a wash in the general case, give pessimal behavior in a subset of cases, and break the utility of round-robin DNS in the majority of cases where the nodes aren't all hosted in the same IP assignment.

So do you have a use case where you think the behavior described in rule 9 *is* desirable? Even if you do have one, I still don't see any reason to think this is a reasonable default behavior on the real-world Internet.

-- 
Steve Langasek, Debian Developer
Give me a lever long enough and a Free OS to set it on, and I can move the world.
http://www.debian.org/
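For readers wondering what option (d) would mean in practice, here is one possible shape for the extra application-side code. It is a sketch under assumptions, not something proposed in the thread: `rerandomize_ipv4` is a hypothetical helper, and shuffling only the IPv4 entries while leaving everything else in place is just one policy an application might choose.

```python
import random
import socket


def rerandomize_ipv4(results, rng=random):
    """Shuffle AF_INET entries among themselves; other entries keep their slots."""
    slots = [i for i, entry in enumerate(results) if entry[0] == socket.AF_INET]
    shuffled = [results[i] for i in slots]
    rng.shuffle(shuffled)
    out = list(results)
    for slot, entry in zip(slots, shuffled):
        out[slot] = entry
    return out


# Hand-made entries in the shape getaddrinfo() returns (documentation addresses).
sample = [
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::1", 80, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", 80)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.2", 80)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.3", 80)),
]
mixed = rerandomize_ipv4(sample, random.Random(0))
print([entry[4][0] for entry in mixed])
```

An application would call it as `rerandomize_ipv4(socket.getaddrinfo(host, port))` before walking the list. Every program converted from gethostbyname() would need something of this kind, which is the cost the thread is arguing about.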
Re: glibc's getaddrinfo() sort order
On Thu, Sep 20, 2007 at 06:19:10PM -0700, Steve Langasek wrote:
> So do you have a use case where you think the behavior described in rule 9 *is* desirable?

Any application written assuming this behaviour, works correctly on Windows, Solaris, *BSD and glibc based systems in general, but not on Debian.

In the bug log, Pierre reported this behaviour is already supported on most of those systems:

] On that matter, according to Aurelien, Vista (maybe XP),
] {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (MacOS X
] and solaris come to mind). So it's kind of a decision of Debian vs. the
] rest of the world. And if I don't really care about the issue of the
] decision technically, this aspect worries me.

Hrm, I see RFC5014 (from this month) provides some socket options for changing the way RFC3484 source address selection works, and envisages the possibility of doing the same for destination address selection. It assumes prefix matching is undertaken in getaddrinfo in order to achieve one of its aims.

> Even if you do have one, I still don't see any reason to think this is a reasonable default behavior on the real-world Internet.

As it happens I largely agree with that. I don't agree with making a decision to go against an IETF standard and glibc upstream lightly, though, no matter how many caps Ian expends repeating that it's at the least mature level of Internet standard. If it's also the case that the RFC-specified behaviour is a de facto standard amongst other OSes, as the above seems to indicate, then that's even more reason to make sure we have a clear decision backed up by good, clear reasoning.

Cheers,
aj
Re: glibc's getaddrinfo() sort order
On Tue, Sep 18, 2007 at 08:41:45PM +0200, Kurt Roeckx wrote:
> glibc is the only implementation I know of that does this.

I have heard, though not confirmed first-hand, that modern versions of FreeBSD, Windows, and Solaris do as well.
Re: glibc's getaddrinfo() sort order
Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
> I'm not familiar with how getaddrinfo() has been implemented in the past

I think this is an important point. If you're not familiar with the history then perhaps I can help explain.

Hostname-to-address lookups have up to recently generally been done with gethostbyname. The addresses from gethostbyname are ordered as they were returned by the nameserver (unless special configuration is made locally to override this, which is rarely done).

When multiple addresses are available for a single lookup, special code in all widely-deployed nameservers arranges to rotate or round-robin the returned addresses: each enquirer gets a new ordering. This is so that a single service name can be made to refer to a number of physical network interfaces (perhaps on different hosts) and the load shared across them. This is known as DNS based load balancing. If the protocol is one like mail, where callers can be expected to try multiple addresses if the first doesn't work, this gives you a failover as well.

So far so good. (For clarity, it is the above round-robin functionality that I am arguing ought to be preserved.)

gethostbyname can theoretically support IPv6 but it can only return one address type per call. While there is a way to embed an IPv4 address in an IPv6 address, for circumstances like these, there is no clear way to tell gethostbyname that the calling application (and the rest of the stack on which the application relies) will cope with getting a pile of AF_INET6 back rather than AF_INET.

Therefore for IPv6, a new interface was needed. The interface (defined in RFC3493 s6.1 and its predecessors) is getaddrinfo. It has several new features most of which aren't relevant here.
The critical new feature is this: getaddrinfo allows the application to specify whether it wants to get only IPv4 addresses or IPv6 addresses as well, and if getting mixed addresses, whether to encode them as AF_INET or as `v6-mapped' AF_INET6 (ie, the 32 bits of IPv4 address padded with a specific prefix to make up an IPv6 address, where the prefix means `no actually this is not an IPv6 address but an IPv4 address and should be used with IPv4'). Combined with various other new facilities, this makes it reasonably straightforward to convert an IPv4-only application to be IPv6-capable.

So, in summary: getaddrinfo is intended to replace gethostbyname.

However, additionally, it was realised that if getaddrinfo can return a mixture of IPv4 and v6 addresses it was necessary to specify in what order they ought to be returned. When RFC3484 was written its authors evidently felt that the best way to do this was to define a comparison function over all addresses, which would define which address was to be preferred.

Heedless of the effect on the DNS round-robin functionality I describe above, the authors of RFC3484 specified (s6 rule 9) that all addresses should be sorted by proximity to the host making the choice - where proximity is defined as the length of the common initial address prefix.

This may have been a disputed but arguable definition of real network proximity for IPv6 at the time 3484 was written. But it is clear now that it is not such a measure in the real IPv6 internet, and it has never been such a measure in the IPv4 internet.

So RFC3484 s6 rule 9 is just wrong, because the reasons behind it do not apply any more if they ever did.

However, it's worse than that: rule 9 is trying to change the behaviour of existing systems. If we agree with rule 9 it ought to apply just as well to applications using gethostbyname. All existing applications using gethostbyname are not in compliance with rule 9.
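As an aside, the `v6-mapped' encoding mentioned above can be inspected with Python's standard `ipaddress` module. This is only an illustration: `to_mapped` is a hypothetical helper, and the addresses are documentation examples.

```python
import ipaddress


def to_mapped(v4):
    """Pad the 32 bits of an IPv4 address with the ::ffff:0:0/96 prefix."""
    return ipaddress.IPv6Address((0xFFFF << 32) | int(ipaddress.IPv4Address(v4)))


mapped = ipaddress.IPv6Address("::ffff:192.0.2.1")
native = ipaddress.IPv6Address("2001:db8::1")

# ipv4_mapped yields the embedded IPv4 address, or None for a genuine IPv6 address.
print(mapped.ipv4_mapped)
print(native.ipv4_mapped)
print(to_mapped("192.0.2.1") == mapped)
```

An application that asks for mapped addresses can therefore handle a single AF_INET6 address type while still reaching IPv4-only hosts.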
It would perhaps be possible to modify gethostbyname to sort addresses according to RFC3484 s5 and s6. But would it be a good idea ? No, obviously not. It would change the behaviour of all of the applications which currently use gethostbyname. Currently such applications pick addresses at random (according to the DNS round robin). Rule 9 would have applications pick them according to longest-common-prefix. This would destroy the DNS based load balancing arrangements. What about getaddrinfo ? Well, there is no reason why a change in API (to add additional richness needed for new functionality) should so radically change the behaviour. And indeed, we see that indeed the DNS load balancing of our own servers has been broken by this change ! That is, applications are changed from using non-rule-9 gethostbyname to rule-9 getaddrinfo, and the servers experience wildly unbalanced load and break. The RFC tries to make getaddrinfo return a predictable ordering in the face of random orderings from DNS. That seems a perfectly reasonable way to define a function in the abstract; though certainly the ordering it comes up with can be criticised. It is not reasonable for the RFC to attempt to specify that the addresses be returned in a predictable
Re: glibc's getaddrinfo() sort order
On Tue, Sep 18, 2007 at 03:33:51PM +0100, Ian Jackson wrote: Anthony Towns writes (Re: glibc's getaddrinfo() sort order): I'm not familiar with how getaddrinfo() has been implemented in the past I think this is an important point. If you're not familiar with the history then perhaps I can help explain. hostname-to-address lookups have up to recently generally been done with gethostbyname. Right, gethostbyname I am familiar with (along with the corresponding DNS round-robin behaviour), and changing its behaviour is certainly unreasonable. [...] So far so good. (For clarify, it is the above round-robin functionality that I am arguing ought to be preserved.) [...] However, additionally, it was realised that if getaddrinfo can return a mixture of IPv4 and v6 addresses it was necessary to specify in what order they ought to be returned. When RFC3484 was written its authors evidently felt that the best way to do this was to define a comparison function over all addresses, which would define which address was to be preferred. Heedless of the effect on the DNS round-robin functionality I describe above, the authors of RFC3484 specified (s6 rule 9) that all addresses should be sorted by proximity to the host making the choice - where proximity is defined as the length of the common initial address prefix. So if getaddrinfo() has always behaved in this way, I don't see a great deal of justification in changing it. The bug log indicated that there were pre-rfc implementations of getaddrinfo() that behaved more like gethostbyname() at least wrt round-robin DNS; but I've got no way of verifying that. This may have been a disputed but arguable definition of real network proximity for IPv6 in at the time 3484 was written. But it is clear now that it is not such a measure in the real IPv6 internet, and it has never been such a measure in the IPv4 internet. I hadn't seen any indication it was disputed for IPv6 prior to your mail. 
The patch in glibc only affected IPv4 addresses, for that matter.

> So RFC3484 s6 rule 9 is just wrong, because the reasons behind it do not apply any more if they ever did.

To give an analogy to the lines I'm thinking along: the definition of tm_year in the tm struct in time.h is wrong, years since 1900 should be years since 0 AD, but the spec says otherwise, so programs simply need to deal with that historical craziness. That's not quite the same here, in that the spec does (by my reading) explicitly allow implementors to not behave in that way, but if you're coding to the spec you certainly can't rely on DNS round-robin being passed through an invocation of getaddrinfo().

> However, it's worse than that: rule 9 is trying to change the behaviour of existing systems. If we agree with rule 9 it ought to apply just as well to applications using gethostbyname. All existing applications using gethostbyname are not in compliance with rule 9.

The RFC specifies the behaviour of getaddrinfo(), not gethostbyname(), so doesn't affect any apps that solely use gethostbyname(). So no, it shouldn't be applied to other functions any more than the definition of tm_year should mean we count from 1900 in every year related function.

I think we can safely say that Rule 9 isn't useful for IPv4 addresses. I'm not sure whether that's true for IPv6 addresses -- it certainly seems an inappropriately hierarchical way of viewing a network that's connected much more ... fluidly than that, at any rate. But even if Rule 9 is completely useless and counterproductive, it's still the standard for that function, which, afaics, we should be meeting.

> What about getaddrinfo ? Well, there is no reason why a change in API (to add additional richness needed for new functionality) should so radically change the behaviour.

Agreed in principle, but this is a rule the RFC should've followed; since they haven't, I'm not convinced we should.
It is not reasonable for the RFC to attempt to specify that the addresses be returned in a predictable ordering when the established behaviour, relied on throughout the internet for decades, has been that the addresses are _not_ returned in a predictable order. Again, I agree with that, but the RFC *has* done that. I'd say it's more important that getaddrinfo() on Debian behave the same as on other operating systems, than that it behave in the same way as other functions. I can only take the RFC's assertion as to getaddrinfo()'s proper behaviour though; I don't have a more direct idea how getaddrinfo() behaves in previous versions of Debian, other Linux distros, other libcs, Windows, etc. This argument is an argument for accepting any crap that comes out of glibc upstream. No, it's an argument for accepting any crap that comes out of the Internet standards process. :-/ As I have demonstrated above, the RFC is wrong, inconsistent with existing practice, It's certainly inconsistent with gethostbyname()'s existing
Re: glibc's getaddrinfo() sort order
On Wed, Sep 19, 2007 at 03:03:51AM +1000, Anthony Towns wrote:
> > Heedless of the effect on the DNS round-robin functionality I describe above, the authors of RFC3484 specified (s6 rule 9) that all addresses should be sorted by proximity to the host making the choice - where proximity is defined as the length of the common initial address prefix.
>
> So if getaddrinfo() has always behaved in this way, I don't see a great deal of justification in changing it. The bug log indicated that there were pre-rfc implementations of getaddrinfo() that behaved more like gethostbyname() at least wrt round-robin DNS; but I've got no way of verifying that.

glibc is the only implementation I know of that does this.

I've attached a small test program. The results are:
sarge: libc6 2.3.2.ds1-22sarge5: random order
etch: libc6 2.3.6.ds1-13etch2: ordered results

One other implementation I'm aware of is in libbind. You'll need to run configure with --enable-libbind for that. It doesn't reorder the results.

I don't know if any of the other libcs in Debian actually provide getaddrinfo(), but I doubt they'll reorder.

There are also lots of applications that have a wrapper around gethostbyname() in case the libc doesn't provide it. It's highly unlikely any of those will do any reordering.

> > This may have been a disputed but arguable definition of real network proximity for IPv6 at the time 3484 was written. But it is clear now that it is not such a measure in the real IPv6 internet, and it has never been such a measure in the IPv4 internet.
>
> I hadn't seen any indication it was disputed for IPv6 prior to your mail. The patch in glibc only affected IPv4 addresses, for that matter.

I've also stated that it might not work properly for IPv6. It's likely that something in the same /32 is close network wise, it's even more likely for /48 and /64, but you probably don't want to go below the /32.

Kurt
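The /32 remark can be made concrete with a small helper. This is a sketch of the rule of thumb in the message above, not anything RFC 3484 specifies: `common_prefix_len6` and `probably_close` are hypothetical names, the 32-bit threshold is taken from Kurt's comment, and the addresses come from the IPv6 documentation prefix and its neighbour.

```python
import ipaddress


def common_prefix_len6(a, b):
    """Number of leading bits two IPv6 addresses share."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    return 128 - diff.bit_length()


def probably_close(a, b, min_bits=32):
    """Treat a prefix match as meaningful only if it covers at least min_bits."""
    return common_prefix_len6(a, b) >= min_bits


here = "2001:db8::1"
for other in ("2001:db8:1::1", "2001:db9::1"):
    print(other, common_prefix_len6(here, other), probably_close(here, other))
```

The second address differs from the first only in the last bit of the second group, so it shares 31 bits and falls just short of the /32 cutoff. A pure longest-prefix sort would still rank it by that 31-bit match.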
Re: glibc's getaddrinfo() sort order
On Tue, Sep 18, 2007 at 08:41:45PM +0200, Kurt Roeckx wrote:
> I've attached a small test program. The results are:
> sarge: libc6 2.3.2.ds1-22sarge5: random order
> etch: libc6 2.3.6.ds1-13etch2: ordered results

Maybe I should attach it.

Kurt

#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>

int main()
{
    struct addrinfo *res, *p, hints;

    /* Ask for any address family, datagram sockets only. */
    hints.ai_flags = 0;
    hints.ai_family = PF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;
    hints.ai_protocol = 0;
    hints.ai_addrlen = 0;
    hints.ai_addr = NULL;
    hints.ai_canonname = NULL;
    hints.ai_next = NULL;

    /* res is only valid if the lookup succeeded. */
    if (getaddrinfo("0.pool.ntp.org", "ntp", &hints, &res) != 0)
        return 1;

    /* Print the IPv4 results in the order getaddrinfo() returned them. */
    for (p = res; p; p = p->ai_next) {
        if (p->ai_family == AF_INET) {
            char ip[INET_ADDRSTRLEN];
            if (inet_ntop(p->ai_family,
                          &(*(struct sockaddr_in *)p->ai_addr).sin_addr,
                          ip, sizeof(ip)) != NULL) {
                printf("%s\n", ip);
            }
        }
    }
    freeaddrinfo(res);
    return 0;
}
Re: glibc's getaddrinfo() sort order
Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
> So if getaddrinfo() has always behaved in this way, I don't see a great deal of justification in changing it. The bug log indicated that there were pre-rfc implementations of getaddrinfo() that behaved more like gethostbyname() at least wrt round-robin DNS; but I've got no way of verifying that.

I don't know whether or not there were previous versions of getaddrinfo with the same behaviour as gethostbyname, but that is the wrong way of looking at it. getaddrinfo wasn't in widespread use until the recent efforts to support IPv6.

Did you miss the bits where I said that
* getaddrinfo is supposed to replace gethostbyname
* applications are being changed to call getaddrinfo instead of gethostbyname
?

There are only three possibilities:

(a) It is correct that the behaviour of applications (and hence of hosts) should be changed to comply with rule 9.

(b) Application behaviour should not change; getaddrinfo should behave the same way as gethostbyname.

(c) Application behaviour should not change but getaddrinfo should comply with rule 9. Applications should therefore not be changed to use getaddrinfo instead of gethostbyname.

Which of these are you proposing ? RFC3484 says (a) but is wrong for the reasons I have explained. (b) is my view. (c) is obviously unreasonable.

Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
> > All existing applications using gethostbyname are not in compliance with rule 9.
>
> The RFC specifies the behaviour of getaddrinfo(), not gethostbyname(),

Nonsense. It doesn't specify the behaviour of any such API at all. RFCs like this one specify the behaviour of _hosts_. That is, it specifies what kind of packets the host should emit and accept, on what interfaces. There is nothing in RFC3484 that limits its application to getaddrinfo rather than gethostbyname.
There is discussion in s8 which suggests some possible behaviours of getaddrinfo as an `implementation strategy' for RFC3484 - but note that our getaddrinfo doesn't do what s8 suggests (because s8 is barking mad). If you agree with RFC3484 s8 then you ought to conclude that similar changes ought to be made to other internal interfaces which do the same job as getaddrinfo.

> so doesn't affect any apps that solely use gethostbyname(). So no, it shouldn't be applied to other functions any more than the definition of tm_year should mean we count from 1900 in every year related function.

This business about tm_year is a complete red herring. In fact, you've got my argument completely backwards.

If someone wrote in a standards document that tm_year should be zero at 0AD (whatever that means) rather than 1900AD, what should we do ? Well, the answer would be obvious: we should continue to do what we have done forever, so as not to change the meaning of existing infrastructure: zero at 1900AD.

This is what RFC3484 s6 is doing. It is trying to change the meaning of existing deployments of multiple IPv4 addresses in the global DNS.

> I think we can safely say that Rule 9 isn't useful for IPv4 addresses.

Are you happy then that we should mandate that the Debian libc maintainer should change our libc accordingly ?

> I'm not sure that's true or not for IPv6 addresses -- it certainly seems an inappropriately hierarchical way of viewing a network that's connected much more ... fluidly than that, at any rate. But even if Rule 9 is completely useless and counterproductive, it's still the standard for that function, which, afaics, we should be meeting.

It is NOT THE STANDARD as I have previously pointed out. An IETF working group proposed that it ought to become the standard but
1. the standard has not advanced further
2. that was in a time when IPv6 addressing structure was understood very differently.
To justify my point 2, that RFC3484 predates substantial changes in the IPv6 addressing architecture:

Site-local addresses are one of the key features that motivates the rules in RFC3484. These were deprecated by RFC3879 (status: PROPOSED) and this was confirmed in RFC4291 (status: DRAFT). (The standards track goes PROPOSED -> DRAFT -> STANDARD.)

DNS for IPv6 was originally intended to be supported with A6, DNAME and bitstring labels according to RFC2874. This was originally Standards Track and was designed to support rapid and continuous renumbering. With the publication of RFC3363 (s1.1) and supported by the arguments in RFC3364, 2874 was moved to EXPERIMENTAL (ie, off the Standards Track), because rapid and continuous renumbering is no longer planned.

Ie, the addressing and numbering arrangements for IPv6 have changed significantly since 3484 was written. That could well be why 3484 hasn't progressed.

> > What about getaddrinfo ? Well, there is no reason why a change in API (to add additional richness needed for new functionality) should so radically change the behaviour.
>
> Agreed in principle, but this is a rule
Re: glibc's getaddrinfo() sort order
On Tue, Sep 18, 2007 at 07:18:40PM +0100, Ian Jackson wrote:
> There are only three possibilities:
>
> (a) It is correct that the behaviour of applications (and hence of hosts) should be changed to comply with rule 9.
>
> (b) Application behaviour should not change; getaddrinfo should behave the same way as gethostbyname.
>
> (c) Application behaviour should not change but getaddrinfo should comply with rule 9. Applications should therefore not be changed to use getaddrinfo instead of gethostbyname.

No, there aren't. A fourth possibility is:

(d) Applications should use getaddrinfo(), and if the ordering behaviour it uses is not desired, they should use an ordering that is desired.

Since we're at the point where you're yelling at me about how I'm not listening, I won't reply further.

Cheers,
aj
Re: glibc's getaddrinfo() sort order
On Tue, Sep 18, 2007 at 08:41:45PM +0200, Kurt Roeckx wrote:
> On Wed, Sep 19, 2007 at 03:03:51AM +1000, Anthony Towns wrote:
> > So if getaddrinfo() has always behaved in this way, I don't see a great deal of justification in changing it. [...]
>
> glibc is the only implementation I know of that does this.

Windows implementations would seem like the other candidate, given the Microsoft Research at the top of that RFC.

Cheers,
aj
Re: glibc's getaddrinfo() sort order
* Ian Jackson ([EMAIL PROTECTED]) [070918 16:35]:
> So RFC3484 s6 rule 9 is just wrong, because the reasons behind it do not apply any more if they ever did.

I have some stanza from the dns-operations list:
http://lists.oarci.net/pipermail/dns-operations/2007-September/002028.html
| Either it [RFC3484] should be corrected or declared Historic.

Denic is the German domain name authority, and they usually know what they do (especially Peter Koch does).

Cheers,
Andi
-- 
http://home.arcor.de/andreas-barth/
Re: glibc's getaddrinfo() sort order
On Thu, Sep 13, 2007 at 12:14:09AM +, Anthony Towns wrote:
> On Thu, Sep 13, 2007 at 12:06:40AM +0100, Ian Jackson wrote:
> > Does anyone have an answer to my point that application of rule 9 changes the long-established meaning of existing DNS data ?
>
> I'm not familiar with how getaddrinfo() has been implemented in the past -- but I think it makes more sense to look at the definition of the function than the data it's manipulating. The RFC tries to make getaddrinfo return a predictable ordering in the face of random orderings from DNS. That seems a perfectly reasonable way to define a function in the abstract; though certainly the ordering it comes up with can be criticised.
>
> > I disagree with your answer to that first question. gethostbyname returns results in random order. getaddrinfo should do the same.
>
> I'd say it's more important that getaddrinfo() on Debian behave the same as on other operating systems, than that it behave in the same way as other functions. I can only take the RFC's assertion as to getaddrinfo()'s proper behaviour though; I don't have a more direct idea how getaddrinfo() behaves in previous versions of Debian, other Linux distros, other libcs, Windows, etc.

Our tests show that Windows XP since SP1 (or 2 ?), Vista, various recent BSDs, and now glibc 2.6 (or 2.5, I don't remember when it was introduced) all behave this way. I've no access to Mac OS X, but I wouldn't be surprised if it works the same. Another interesting hint would be to test on Solaris too.

-- 
Pierre Habouzit
http://www.madism.org
Re: glibc's getaddrinfo() sort order
Anthony Towns writes (Re: glibc's getaddrinfo() sort order): On Fri, Sep 07, 2007 at 01:06:06AM +0200, Kurt Roeckx wrote: It's atleast in the spirit of the rfc to prefer one that's on the local network. It might be the intention of rule 9, but then rule 9 isn't very well written. Rule 9 seems perfectly well written, it just does something you (reasonably) consider undesirable. Should I take that as agreement with Steve's and my view, that we should by default not apply rule 9 to IPv4 ? Your opinion seems unclear to me. We haven't heard from the rest of the committee. Does anyone have an answer to my point that application of rule 9 changes the long-established meaning of existing DNS data ? (In ways, I would add, which have proven to cause significant operational problems in practice.) As I say, I think that point is unanswerable and leads inevitably to the conclusion that we should disable this behaviour by default. The rest of your (AJ's) mail seems to be getting bogged down a bit. I'll try to answer what I see as the key aspects. In addition, I think there's two different aspects here: the first is should getaddrinfo() return results in random order to aid in load distribution? and the second is is prefix matching a reasonable way to determine a good host to use? I disagree with your answer to that first question. gethostbyname returns results in random order. getaddrinfo should do the same. (random isn't quite true but it's true enough in the usual case.) AFAICS, the answer to the first question is simply no, it shouldn't -- randomised load balancing like that needs to be done at the application level, You are mistaken. Randomised load balancing like that is _already done_ using multiple IPv4 addresses in the DNS. It has been done this way for nearly two decades. [stuff] Doing it by changing Rule 9 to: I don't think this kind of complexity is warranted here. 
Even if it were, you seem to be proposing a strategy which depends on guessing whether communication with a particular destination address would involve NAT, which would be fragile. Ian.
Re: glibc's getaddrinfo() sort order
On Thu, Sep 13, 2007 at 12:06:40AM +0100, Ian Jackson wrote:
> Does anyone have an answer to my point that application of rule 9 changes the long-established meaning of existing DNS data ?

I'm not familiar with how getaddrinfo() has been implemented in the past -- but I think it makes more sense to look at the definition of the function than the data it's manipulating. The RFC tries to make getaddrinfo return a predictable ordering in the face of random orderings from DNS. That seems a perfectly reasonable way to define a function in the abstract; though certainly the ordering it comes up with can be criticised.

> I disagree with your answer to that first question. gethostbyname returns results in random order. getaddrinfo should do the same.

I'd say it's more important that getaddrinfo() on Debian behave the same as on other operating systems, than that it behave in the same way as other functions. I can only take the RFC's assertion as to getaddrinfo()'s proper behaviour though; I don't have a more direct idea how getaddrinfo() behaves in previous versions of Debian, other Linux distros, other libcs, Windows, etc.

> > AFAICS, the answer to the first question is simply no, it shouldn't -- randomised load balancing like that needs to be done at the application level,
>
> You are mistaken. [...]

What getaddrinfo() should and shouldn't do is defined by the standard, not by what would be most useful. :-/

FWIW, if the standard should be changed, it seems to me that it'd carry more weight having the Debian tech ctte put that recommendation in than a random DD.

Cheers,
aj
Re: glibc's getaddrinfo() sort order
I concur with all of Ian's comments, and in particular I would also like to encourage Kurt to champion this issue to the IETF working group. My own past experiences suggest that glibc upstream is willing to hide behind standards not only when they mandate undesirable behavior but also when they fail to /prohibit/ undesirable behavior, so it would be nice to have a solution that in the long term doesn't require the Debian glibc maintainers to diverge from upstream in order to comply with a ruling of the TC. I would also underscore the additional reason Kurt has pointed out for why RFC3484 section 6 rule 9 is inappropriate for IPv4 networks, even in the absence of NAT. Over the years, the IPv4 address space has become extremely fragmented, in large part due to an incomplete understanding of the long-term significance of early stewardship policies. As an example, by 2003 the ISP I was working for had network allocations in each of 206.x.x.x, 208.x.x.x, and 64.x.x.x, and have since picked up netblocks in 216.x.x.x and 63.x.x.x. While some of these netblocks do share common prefixes, the common prefixes are so short that they aren't even specific to North America[1], and some of the netblocks are far enough apart that rule 9 would give precedence to half the planet over the router down the hall. Rule 9 follows naturally from IPv6 allocation policies which have been crafted in direct response to the experiences with IPv4 with the intent of minimizing address space fragmentation. In IPv6, 64 bits of the address are host bits, and another 16 bits of the prefix denote local networks, with the remaining 48 bits corresponding fairly well with network topology. This rule is therefore a sensible default for IPv6, but for IPv4 it easily results in pessimal behavior and should not be a default. Cheers, -- Steve Langasek Give me a lever long enough and a Free OS Debian Developer to set it on, and I can move the world. 
[EMAIL PROTECTED] http://www.debian.org/

[1] http://xkcd.com/195/ :-)
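The netblock example above can be checked with a few lines of C. This sketch computes the RFC3484 CommonPrefixLen for two IPv4 addresses held in host byte order. common_prefix_len() and ipv4() are hypothetical helper names, and the host parts of any address passed in are made up, since the message gives only first octets.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leading bits shared by two IPv4 addresses
 * (both in host byte order). Identical addresses give 32. */
static int common_prefix_len(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;   /* set bits mark positions that differ */
    int len = 0;
    while (len < 32 && (diff & 0x80000000u) == 0) {
        diff <<= 1;
        len++;
    }
    return len;
}

/* Build a host-order IPv4 address from its four dotted-quad octets. */
static uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    assert(a < 256 && b < 256 && c < 256 && d < 256);
    return ((uint32_t)a << 24) | ((uint32_t)b << 16)
         | ((uint32_t)c << 8) | (uint32_t)d;
}
```

With these helpers, an address in 206.x.x.x shares 3 leading bits with one in 208.x.x.x and none with one in 64.x.x.x. From a 206.x.x.x source, every destination whose first octet is 128 or higher therefore outranks the 64.x.x.x netblock under rule 9, however close that netblock really is.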
Re: glibc's getaddrinfo() sort order
Kurt Roeckx writes (Re: glibc's getaddrinfo() sort order):
> It's at least in the spirit of the rfc to prefer one that's on the local network. It might be the intention of rule 9, but then rule 9 isn't very well written.

I agree that applying RFC3484 section 6 rule 9 to IPv4 addresses is a mistake and that therefore we should change the default in Debian accordingly.

I would encourage Kurt to take this matter up with the relevant IETF working group.

Others have already written about problems involving NAT. I agree with this argument (although I don't approve of NAT and it galls me to use some braindamage involving NAT as an argument for anything).

However there is another argument I would like to make:

A host using getaddrinfo configured to apply rule 9 to IPv4 addresses will behave quite differently to a host using gethostbyname. I think that this change in behaviour is unwarranted. Whether an application uses gethostbyname or getaddrinfo is an implementation detail (related closely to whether that particular application's source code has been modified to try to support IPv6) and this should not change the behaviour.

Presently when connecting to a service offering only IPv4 addresses, most hosts will use gethostbyname and use the addresses offered in round-robin DNS order.

That is to say, the meaning (pre-RFC3484, and current de-facto) of a DNS RRset containing several IP addresses is that the addresses should be tried `uniformly at random' by callers, as done by the nameserver round-robin RRset rotation algorithm. RFC3484 section 6 rule 9 applied to IPv4 appears to be an attempt to change that meaning.

This interpretation of rule 9 for IPv4 as an attempt to change the meaning of existing deployed DNS RRsets is supported by the fact that proponents of rule 9 for IPv4 claim that it will fix existing problems, as in http://udrepper.livejournal.com/16116.html.
However, it is obviously wrongheaded to attempt to change the defined meaning of all existing multi-record A RRsets. On the existing Internet, zone administrators use multi-record A RRsets in the knowledge that those RRsets will be used by callers in an evenly-distributed round-robin fashion as currently implemented by bind and gethostbyname.

This meaning for multiple A records had been established for well over a decade by the time 3484 was written and in the intervening years it has continued to be dominant. New systems, and systems newly modified to support IPv6, should continue to interpret existing A RRsets in the same way as before.

A few cursory web searches show that this new behaviour of getaddrinfo is indeed causing trouble as applications are converted to IPv6 and the change in behaviour with IPv4 is found to be undesirable.

Finally, I would like to preemptively address the line "but this is an RFC and we must do what it says". There are two responses:

The most obvious one is that RFC3484 is merely Proposed Standard. At this stage of the standardisation process one can expect to find errors, mistaken deviations from existing practice, and so on. (The IETF standardisation process has been broken so that documents often get stuck in this state; but that doesn't mean that we should treat draft documents as if they were gospel, let alone documents that aren't even drafts.)

The second is a more general point: if a standards document tells us to do something which is wrong, then we should not do it. Obviously we should think fairly hard before making the decision to go against a standard, but our job is to do the right thing and standards documents are there to help us not to constrain us. I think my argument above about the existing meaning of multiple A records is irrefutable.

> I already suggested that maybe rule 9 should be limited to the common prefix length of the netmask you're using.
> Another option is that you extend rule 2 to have the same behaviour with ipv4, and that 10/8, 172.16/12 and 192.168/16 should be considered organization-local.

Replacing rule 9 with something more limited based on local network interfaces (ie, prefer what appear to be locally-attached addresses) would be fine. Or a default based on routing metrics would be fine too. (Although I think these may be too much work to do in getaddrinfo.) The problem occurs when we start ranking IPv4 addresses of foreign systems about which we have no special knowledge of the topology.

Ranking RFC1918 addresses ahead of others is not entirely a safe thing to do because people sometimes foolishly publish RFC1918 addresses for public services and expect callers to skip those addresses somehow. But at least it wouldn't break people who weren't already doing wrong things.

Ian.
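Both alternatives mentioned here, treating RFC1918 addresses specially and preferring destinations that look locally attached, reduce to simple mask tests. A sketch in C, assuming addresses are held in host byte order and that the caller already knows its own interface address and prefix length. is_rfc1918() and same_subnet() are hypothetical names, not functions from glibc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the address lies in one of the RFC1918 private ranges. */
static bool is_rfc1918(uint32_t addr)
{
    return (addr & 0xFF000000u) == 0x0A000000u    /* 10.0.0.0/8     */
        || (addr & 0xFFF00000u) == 0xAC100000u    /* 172.16.0.0/12  */
        || (addr & 0xFFFF0000u) == 0xC0A80000u;   /* 192.168.0.0/16 */
}

/* True if a and b agree on their first prefix_len bits, i.e. b looks
 * locally attached from an interface with address a and that netmask. */
static bool same_subnet(uint32_t a, uint32_t b, int prefix_len)
{
    assert(prefix_len >= 0 && prefix_len <= 32);
    /* A shift by 32 would be undefined, so /0 is handled separately. */
    uint32_t mask = prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);
    return ((a ^ b) & mask) == 0;
}
```

A resolver using something like same_subnet() would only promote destinations it has local knowledge about, and would leave all foreign addresses in the order the DNS supplied them.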
Re: glibc's getaddrinfo() sort order
On ven, sep 07, 2007 at 07:15:42 +, Pierre Habouzit wrote:
> On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
> > Pierre Habouzit wrote:
>
> Also note that probably many many Windows machines work that way (the RFC was written by a MS guy). And this behaviour impacts software developers, and people that hoped that having multiple A records for their service will see a perfect round robin will be stuck anyways. I mean, it's non previous-practice-backward-compliant and one can argue reasonably it sucks. But hel-llooo ! this kind of design choice is not only local. If every one (or the majority) on the internet behaves like this, fixing this bug (if it is really one) in Debian will _not_, I say _not_ prevent us from fixing many software that rely on DNS round robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE will have to cope with that whatever choice is made.

On that matter, according to Aurélien, Vista (maybe XP), {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (Mac OS X and Solaris come to mind). So it's kind of a decision of Debian vs. the rest of the world. And even if I don't really care about the outcome of the decision technically, this aspect worries me.

-- 
Pierre Habouzit
http://www.madism.org
Re: glibc's getaddrinfo() sort order
On Fri, Sep 07, 2007 at 01:06:06AM +0200, Kurt Roeckx wrote:
> It's at least in the spirit of the rfc to prefer one that's on the local network. It might be the intention of rule 9, but then rule 9 isn't very well written.

Rule 9 seems perfectly well written, it just does something you (reasonably) consider undesirable. The RFC says:

] Rule 9: Use longest matching prefix.
] When DA and DB belong to the same address family (both are IPv6 or
] both are IPv4): If CommonPrefixLen(DA, Source(DA)) >
] CommonPrefixLen(DB, Source(DB)), then prefer DA. Similarly, if
] CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
] then prefer DB.
]
] Rule 10: Otherwise, leave the order unchanged.
] If DA preceded DB in the original list, prefer DA. Otherwise prefer
] DB.
]
] Rules 9 and 10 may be superseded if the implementation has other
] means of sorting destination addresses. For example, if the
] implementation somehow knows which destination addresses will result
] in the best communications performance.

"The admin says that rule 9 isn't appropriate" seems to fit "somehow knows which destination address will result in the best communications performance", so afaict, the description in the new gai.conf,

# sortv4 yes|no
#If set to no, getaddrinfo(3) will ignore IPv4 adresses in rule 9. See
#section 6 in RFC 3484. The default is yes. Setting this option to
#no breaks conformance to RFC 3484.

is incorrect, in that the implementation is still in conformance with the RFC.

In addition, I think there's two different aspects here: the first is "should getaddrinfo() return results in random order to aid in load distribution?" and the second is "is prefix matching a reasonable way to determine a good host to use?"

AFAICS, the answer to the first question is simply no, it shouldn't -- randomised load balancing like that needs to be done at the application level, or by giving different sets of IPs in response to DNS queries by different hosts, such as using BGP or similar.
As far as pool.ntp.org is concerned, that looks like the end of the story, afaics: ntp can't rely on getaddrinfo to give a suitably random answer.

OTOH, getaddrinfo is meant to give a close answer, and doing prefix matching on NATed addresses isn't the Right Thing. For IPv6, that's fine because it's handled by earlier scoping rules. For NATed IPv4 though the prefix we should be using is whatever the host is going to be NATed *to*. And that would imply that the Right Thing would be to have an option more like:

pretend-that 10/8 is-really 1.2.3.4/32

That doesn't seem likely to work though because it requires extra manual configuration, which won't happen.

Giving up on actually getting getaddrinfo to give close answers for NATed boxes leaves the option of trying to avoid getaddrinfo going out of its way to give far answers instead, which would mean turning off prefix-matching for NATed boxes; which could be done by ignoring rule 9 by default for private IPv4 addresses.

Actually, it might also be reasonable to ignore rule 9 if

scope(DA) > scope(source(DA)) and scope(DB) > scope(source(DB))

which seems reasonably equivalent to "DA and DB are only reachable through a NAT" for both IPv4 and IPv6. The corner case is if the destination is in a DMZ and can access both the Internet and local boxes directly, but I don't think you can get the right answer for that atm anyway.

Doing it by changing Rule 9 to:

Rule 9: Use longest matching prefix.
When DA and DB belong to the same address family (both are IPv6 or both are IPv4): If xCommonPrefixLen(DA, Source(DA)) > xCommonPrefixLen(DB, Source(DB)), then prefer DA. Similarly, if xCommonPrefixLen(DA, Source(DA)) < xCommonPrefixLen(DB, Source(DB)), then prefer DB.

If scope(X) > scope(Y) then xCommonPrefixLen(X,Y) = 0
Else: xCommonPrefixLen(X,Y) = CommonPrefixLen(X,Y)

would give reasonable behaviour, I think (preferring addresses that can be reached without NAT first, then leaving addresses that require NAT in the order received).
In essence, the problem is that comparing prefixes of real addresses against addresses that will be NATed is not adding information, and is possibly losing information -- eg, if your site DNS already orders A addresses by prefix matching on your actual IP range.

> I already suggested that maybe rule 9 should be limited to the common prefix length of the netmask you're using.
>
> Another option is that you extend rule 2 to have the same behaviour with ipv4, and that 10/8, 172.16/12 and 192.168/16 should be considered organization-local.

Those are specified as having site-local scope in 3.2; but Rule 2 only comes into play if one of the IPs returned by the nameserver is also site-local anyway which isn't particularly useful.

Cheers,
aj
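The modified rule proposed in this message can be sketched in C for the IPv4 case. The scope assignments follow RFC3484 section 3.2 as cited in the thread (private ranges are site-local, 169.254/16 and 127/8 are link-local, everything else is global). The sketch reads the scope test as "X has a wider scope than Y", which matches the NAT case described, and that reading is an assumption. ipv4_scope(), common_prefix_len() and x_common_prefix_len() are hypothetical names, with addresses in host byte order.

```c
#include <stdint.h>

/* Scope values ordered from narrowest to widest. */
enum scope { SCOPE_LINK = 2, SCOPE_SITE = 5, SCOPE_GLOBAL = 14 };

/* Scope of an IPv4 address, following RFC3484 section 3.2. */
static enum scope ipv4_scope(uint32_t a)
{
    if ((a & 0xFFFF0000u) == 0xA9FE0000u) return SCOPE_LINK;  /* 169.254/16 */
    if ((a & 0xFF000000u) == 0x7F000000u) return SCOPE_LINK;  /* 127/8      */
    if ((a & 0xFF000000u) == 0x0A000000u) return SCOPE_SITE;  /* 10/8       */
    if ((a & 0xFFF00000u) == 0xAC100000u) return SCOPE_SITE;  /* 172.16/12  */
    if ((a & 0xFFFF0000u) == 0xC0A80000u) return SCOPE_SITE;  /* 192.168/16 */
    return SCOPE_GLOBAL;
}

/* Plain CommonPrefixLen: number of leading bits two addresses share. */
static int common_prefix_len(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    int len = 0;
    while (len < 32 && (diff & 0x80000000u) == 0) {
        diff <<= 1;
        len++;
    }
    return len;
}

/* xCommonPrefixLen(X, Y): 0 when X (the destination) has a wider scope
 * than Y (the source), i.e. the NAT case; otherwise CommonPrefixLen. */
static int x_common_prefix_len(uint32_t x, uint32_t y)
{
    if (ipv4_scope(x) > ipv4_scope(y))
        return 0;
    return common_prefix_len(x, y);
}
```

Under this reading, a global destination seen from a private source always scores 0, so such destinations stay in the order the DNS returned them, while private-to-private and global-to-global pairs are still compared by prefix.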
Re: glibc's getaddrinfo() sort order
On ven, sep 07, 2007 at 07:45:52 +, Pierre Habouzit wrote:
> On ven, sep 07, 2007 at 07:15:42 +, Pierre Habouzit wrote:
> > On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
> > > Pierre Habouzit wrote:
> >
> > Also note that probably many many Windows machines work that way (the RFC was written by a MS guy). And this behaviour impacts software developers, and people that hoped that having multiple A records for their service will see a perfect round robin will be stuck anyways. I mean, it's non previous-practice-backward-compliant and one can argue reasonably it sucks. But hel-llooo ! this kind of design choice is not only local. If every one (or the majority) on the internet behaves like this, fixing this bug (if it is really one) in Debian will _not_, I say _not_ prevent us from fixing many software that rely on DNS round robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE will have to cope with that whatever choice is made.
>
> On that matter, according to Aurélien, Vista (maybe XP), {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (Mac OS X and Solaris come to mind). So it's kind of a decision of Debian vs. the rest of the world. And even if I don't really care about the outcome of the decision technically, this aspect worries me.

Still one technical point, here is the excerpt from the RFC on the offending rule:

Rule 9: Use longest matching prefix.
When DA and DB belong to the same address family (both are IPv6 or both are IPv4): If CommonPrefixLen(DA, Source(DA)) > CommonPrefixLen(DB, Source(DB)), then prefer DA. Similarly, if CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)), then prefer DB.

What it means is that for IPs with the same common prefix, the order of the address is unchanged wrt how it came up in the DNS answer.

What it means, is that when I use apt to fetch from ftp.debian.org from my home ISP (proxad) it takes the mirror that proxad does (ftp.fr.d.o).
When I go to my parents, using wanadoo (now Orange), it picks the Oleane one (ftp.fr2.d.o) which indeed is nearer. It makes complete sense. And as per the common prefix rule, on a local network, RR still can be assumed on a given VLAN. It actually makes quite some sense to me.

Maybe that's why Joey Hess had variability: the RFC does not specify a *full* ordering, it just aims to restrict the RR to the servers nearest to the client.

Of course, usually ISP IPs have a first octet smaller than 127, so if you host a service with RR on a network with the first octet greater than 128 and a mirror on an IP with a first octet smaller than 128, the client of your service from the ISP will never choose the former because of this rule.

This is an RFC that favors people with large mirroring networks for their service, and hinders people with small mirroring networks because they have to choose the IPs for their network servers with care.

I think I've described everything important for the Ctte to rule on this, so unless a question pops up, I'll let you rule in peace :)

-- 
Pierre Habouzit
http://www.madism.org
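The point that rule 9 yields only a partial ordering can be illustrated with a stable sort. This C sketch sorts destinations by descending common prefix length with the source and leaves ties in DNS order. It assumes IPv4 addresses in host byte order and a single source address for every destination (RFC3484 selects Source(D) per destination), and prefix_len() and sort_rule9() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of leading bits shared by two host-order IPv4 addresses. */
static int prefix_len(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    int len = 0;
    while (len < 32 && (diff & 0x80000000u) == 0) {
        diff <<= 1;
        len++;
    }
    return len;
}

/* Stable insertion sort: a destination moves ahead of its predecessor
 * only when it shares strictly more leading bits with src, so entries
 * with equal prefix length keep the order the DNS answer had. */
static void sort_rule9(uint32_t *dst, size_t n, uint32_t src)
{
    assert(dst != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        uint32_t cur = dst[i];
        int cur_len = prefix_len(cur, src);
        size_t j = i;
        while (j > 0 && prefix_len(dst[j - 1], src) < cur_len) {
            dst[j] = dst[j - 1];
            j--;
        }
        dst[j] = cur;
    }
}
```

With a made-up source of 82.0.0.1 and the destinations 192.0.2.1, 198.51.100.1 and 80.0.0.1 in that DNS order, the sort puts 80.0.0.1 first (6 shared bits) and keeps the other two, which share no bits with the source, in their original order. That is the first-octet effect described above: the destinations above 128 are only ever tried after the one below 128.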
Re: glibc's getaddrinfo() sort order
On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
> Pierre Habouzit wrote:
> > The point is, there is an RFC, and we put a patch so that admins can disable it using gai.conf.
>
> "There is an RFC" is not always a good excuse for breaking existing systems. "Admins can disable it" is not a good argument when one common class of the breakage is all the systems that _don't_ disable it hammering systems that have round-robins set up to distribute load. More generally, "we added an option so your bug is fixed" is a common fallacy.

The point is: the option is here, I don't really care if the Ctte decides to set true or false by default. My underlying point was just that the switch is easy for us. So yeah, if we change the default option, the bug is definitely fixed and is anything but a fallacy.

OTOH there is no way upstream will change that (Ulrich refused the patch with blatant aggressiveness), so every other distribution (Fedora and RedHat, probably many others) will work that way. So we can rule everything we want here (and I absolutely don't care about the outcome of the decision, I was just giving some pointers to the pros as I assumed that the cons were obvious to anyone), this will not change how upstream glibc works, so many people (probably a majority ?) will use this new scheme anyway.

And I also say that knowing Uli, (and knowing how deeply I care about this issue ;p) I won't spend a minute trying to argue with Uli, I'm not insane, and don't have the man-years to do that.

Also note that probably many many Windows machines work that way (the RFC was written by a MS guy). And this behaviour impacts software developers, and people that hoped that having multiple A records for their service will see a perfect round robin will be stuck anyways. I mean, it's non previous-practice-backward-compliant and one can argue reasonably it sucks. But hel-llooo ! this kind of design choice is not only local.
If every one (or the majority) on the internet behaves like this, fixing this bug (if it is really one) in Debian will _not_, I say _not_ prevent us from fixing many software that rely on DNS round robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE will have to cope with that whatever choice is made.

> BTW, I'm seeing some programs that use getaddrinfo and still don't have the RFC 3484 sorting behavior. Is this controlled by the AI_ADDRCONFIG flag?

TTBOMK it's a bug wrt intended behaviour as per upstream.

-- 
Pierre Habouzit
http://www.madism.org
Re: glibc's getaddrinfo() sort order
On Fri, Sep 07, 2007 at 06:54:21PM +1000, Anthony Towns wrote: OTOH, getaddrinfo is meant to give a close answer, and doing prefix matching on NATed addresses isn't the Right Thing. For IPv6, that's fine because it's handled by earlier scoping rules. For NATed IPv4 though the prefix we should be using is whatever the host is going to be NATed *to*. And that would imply that the Right Thing would be to have an option more like: pretend-that 10/8 is-really 1.2.3.4/32 That doesn't seem likely to work though because it requires extra manual configuration, which won't happen. Giving up on actually getting getaddrinfo to give close answers for NATed boxes leaves the option of trying to avoid getaddrinfo going out of its way to give far answers instead, which would mean turning off prefix-matching for NATed boxes; which could be done by ignoring rule 9 by default for private IPv4 addresses. The problem with IPv4 is not only about NAT; NAT just happens to show the problem more clearly. With the IPv6 allocation policies, it's likely that the more high-order bits match, the closer the address is network-wise. That is rather unlikely in the IPv4 case, especially if you go beyond /16. Kurt
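To make the prefix-matching concern concrete, here is a minimal sketch of how a rule 9 style sort orders IPv4 destinations. It is an illustration only, not glibc's implementation: the helper names are hypothetical, the addresses are taken from documentation ranges, and a single source address is assumed for all destinations (the RFC pairs each destination with its own selected source).

```python
import ipaddress


def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv4 addresses (0..32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    # The highest set bit of the XOR marks the first differing bit.
    return 32 - diff.bit_length()


def sort_by_longest_prefix(source, candidates):
    """Order candidates by descending shared prefix with `source`.

    sorted() is stable, so candidates with equal prefix lengths keep
    the order in which the resolver returned them.
    """
    return sorted(candidates, key=lambda d: -common_prefix_len(source, d))


if __name__ == "__main__":
    # Hypothetical client behind NAT and a hypothetical round-robin RRset.
    source = "192.168.1.5"
    rrset = ["203.0.113.10", "198.51.100.10", "192.0.2.10"]
    for dest in sort_by_longest_prefix(source, rrset):
        print(dest, common_prefix_len(source, dest))
```

With these inputs the client always ends up with the same server first, whatever order the DNS server rotated the records into, even though sharing 8 bits with a private source address says nothing about network distance.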
glibc's getaddrinfo() sort order
Hi, I don't agree with the glibc maintainer(s) about whether getaddrinfo() should sort the results or not. I think the current way it sorts things does not work at all in IPv4, and I think it does more harm than good. I'm seeking input from the tech-ctte on how to handle this. Kurt
Re: glibc's getaddrinfo() sort order
On Thu, Sep 06, 2007 at 10:04:23PM +, Kurt Roeckx wrote: Hi, I don't agree with the glibc maintainer(s) about whether getaddrinfo() should sort the results or not. I think the current way it sorts things does not work at all in IPv4, and I think it does more harm than good. I'm seeking input from the tech-ctte on how to handle this. The point is, there is an RFC, and we put a patch so that admins can disable it using gai.conf. Note also that old calls like gethostbyname still return addresses randomly. It's somehow accepted that people using getaddrinfo should be aware of the RFC requirements wrt ordering, and that applications for which DNS round robin may matter should implement their own randomization. The Ctte may want to read: - http://udrepper.livejournal.com/16116.html - http://people.redhat.com/drepper/linux-rfc3484.html -- Pierre Habouzit http://www.madism.org
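For reference, the "implement their own randomization" suggestion could look roughly like the sketch below. The function names are hypothetical and this is only one possible approach; note that a plain shuffle also discards every other ordering preference the resolver applied (for example between IPv6 and IPv4), which an application may or may not want.

```python
import random
import socket


def shuffled_addrinfo(results, rng=random):
    """Return a shuffled copy of a getaddrinfo()-style result list."""
    shuffled = list(results)
    rng.shuffle(shuffled)
    return shuffled


def connect_any(host, port, timeout=5.0):
    """Try the resolved addresses in random order; return a connected socket."""
    last_error = None
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in shuffled_addrinfo(infos):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # Remember the failure and fall through to the next address.
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError("no addresses returned for %s" % host)
```

A caller would use `connect_any("www.example.org", 80)` in place of resolving and connecting to the first result only.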
Re: glibc's getaddrinfo() sort order
On Fri, Sep 07, 2007 at 12:34:10AM +0200, Pierre Habouzit wrote: The Ctte may want to read: - http://udrepper.livejournal.com/16116.html - http://people.redhat.com/drepper/linux-rfc3484.html The first one makes a point with which I partly agree, but also partly disagree. It's at least in the spirit of the RFC to prefer an address that's on the local network. That might be the intention of rule 9, but then rule 9 isn't very well written. In the case where the server has 2 addresses assigned, I doubt that you're going to advertise the local one outside. So you at least have a different response for an internal and an external query. I don't see why the internal query should also return the external address. I already suggested that maybe rule 9 should be limited to the common prefix length of the netmask you're using. Another option is to extend rule 2 to have the same behaviour with IPv4, so that 10/8, 172.16/12 and 192.168/16 would be considered organization-local. Ulrich Drepper actually called it site-local in the second document, but I think organization-local would be the right scope for it. Kurt
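The two alternatives mentioned above can be sketched as follows. This is one possible reading of the proposal, not anything glibc implements as described here: the helper names are hypothetical, and "limited to the netmask" is interpreted as "a prefix match only counts when the destination lies inside the source interface's own subnet".

```python
import ipaddress

# The private IPv4 ranges named in the message above.
PRIVATE_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_private_ipv4(addr):
    """True if addr falls in 10/8, 172.16/12 or 192.168/16."""
    ip = ipaddress.IPv4Address(addr)
    return any(ip in net for net in PRIVATE_NETS)


def onlink_prefix_len(source_iface, dest):
    """Prefix score that only rewards destinations on the local subnet.

    source_iface is an address with its netmask, e.g. "192.168.1.5/24".
    Off-link destinations all score 0, so a stable sort leaves them in
    the order the resolver returned them.
    """
    iface = ipaddress.IPv4Interface(source_iface)
    if ipaddress.IPv4Address(dest) in iface.network:
        return iface.network.prefixlen
    return 0


if __name__ == "__main__":
    print(is_private_ipv4("172.20.1.1"))                      # inside 172.16/12
    print(onlink_prefix_len("192.168.1.5/24", "192.168.1.77"))
    print(onlink_prefix_len("192.168.1.5/24", "192.0.2.10"))
```

Under this reading a server on the client's own subnet is still preferred, while remote addresses tie with each other and the DNS server's rotation survives.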
Re: glibc's getaddrinfo() sort order
Pierre Habouzit wrote: The point is, there is an RFC, and we put a patch so that admins can disable it using gai.conf. "There is an RFC" is not always a good excuse for breaking existing systems. "Admins can disable it" is not a good argument when one common class of the breakage is all the systems that _don't_ disable it hammering systems that have round-robins set up to distribute load. More generally, "we added an option so your bug is fixed" is a common fallacy. Note also that old calls like gethostbyname still return addresses randomly. It's somehow accepted that people using getaddrinfo should be aware of the RFC requirements wrt ordering, and that applications for which DNS round robin may matter should implement their own randomization. getaddrinfo was around for many years before RFC 3484. It's been in glibc since 1996. So you're saying that developers writing code in the 90's should have somehow been aware of an RFC that was published in 2003. BTW, I'm seeing some programs that use getaddrinfo and still don't have the RFC 3484 sorting behavior. Is this controlled by the AI_ADDRCONFIG flag? -- see shy jo