Re: glibc's getaddrinfo() sort order

2007-09-24 Thread Clint Adams
On Mon, Sep 24, 2007 at 11:18:00AM +0100, Ian Jackson wrote:
 COMMON BEHAVIOUR ON TODAY'S INTERNET IS THAT IMPLEMENTED BY
 GETHOSTBYNAME.

Common behavior for gethostbyname() on today's Internet is that
implemented commonly in gethostbyname().

 How many times do I have to explain this ?  getaddrinfo is the
 REPLACEMENT FOR GETHOSTBYNAME.  It is not an interface which
 applications choose because they want different address sorting
 behaviour.  It is the interface applications MUST USE TO SUPPORT IPV6.

I don't think that this is true.  getipnodebyname() is an interface
applications can use and is much more conducive to drop-in replacement
given its interface.  I am not recommending use of this function, but
your leap of logic is strange.

 Changing applications to use getaddrinfo instead of gethostbyname is
 done BECAUSE THOSE APPLICATIONS ARE BEING UPDATED TO SUPPORT IPV6.

I think it's done because it's a better, more standardized interface.

 Updating an application to support IPv6 should not change the way it
 treats DNS RRsets containing multiple IPv4 addresses.  Obviously.

Anything assuming what you assume about DNS resolution is going to break
in the future eventually.


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: glibc's getaddrinfo() sort order

2007-09-23 Thread Steve Langasek
[In all my comments below, I am assuming that we are focused on rule 9 as
pertains to sorting of IPv4 addresses.  A strict sorting of IPv6 addresses
by length of prefix match is also questionable, but not so much so that I
believe overruling is justified.]

On Fri, Sep 21, 2007 at 01:07:49PM +1000, Anthony Towns wrote:
 On Thu, Sep 20, 2007 at 06:19:10PM -0700, Steve Langasek wrote:
  So do you have a use case where you think the behavior described in rule 9
  *is* desirable?

 Any application written assuming this behaviour, works correctly on
 Windows, Solaris, *BSD and glibc based systems in general, but not
 on Debian.

So my argument here is that I don't believe there *are* any applications
being written that assume this behavior; and that even if there were, such
applications would either work just fine with the previous getaddrinfo()
behavior, or be too pathological to live.

The goal of RFC3484 is to describe how system resolvers should sort
addresses in order to give applications the best address first.  I think
it's already established in this thread (correct me if I'm wrong) that this
specific rule of sorting by the length of the prefix match does *not*
further the goal of sorting addresses from most to least desirable.
Instead, taken over the whole Internet rule 9 is statistically a
pseudo-randomization relative to the *correct* sorting[1], but with features
that make it an outright pessimization for particular real-world hostnames
due to the set of addresses returned.  I don't believe any sane application
could depend on such pessimization.

One of the existing use cases that breaks is round-robin DNS.  Round-robin
DNS is not an IETF standard; its use has been discouraged by various parties
for years; it has limitations that make it unsuitable for any but the
simplest of configurations.  But none of these are reasons to willfully
degrade currently working setups.  They might be reasons why RR DNS would be
an acceptable sacrifice in favor of other beneficial features, but rule 9 as
written offers *no* benefits in the general case!

Another possibility is a DNS server that intelligently sorts records based
on knowledge of the client, but returns all the addresses instead of
truncating the list.  Arguably it would be more intelligent if the DNS
server didn't return multiple records in this case, but again rule 9 is not
an improvement, so why should it be honored?

 In the bug log, Pierre reported this behaviour is already supported on
 most of those systems:

 ] On that matter, according to Aurelien, Vista (maybe XP),
 ] {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (MacOS X
 ] and solaris come to mind). So it's kind of a decision of Debian vs. the
 ] rest of the world. And if I don't really care about the issue of the
 ] decision technically, this aspect worries me.

The purpose of following standards is to foster interoperation between
products of multiple vendors.

Conforming with Section 6, rule 9 of RFC 3484 does *not* improve
interoperation with other vendors.  Instead, it breaks interoperation with
existing Internet infrastructure.

Ignoring rule 9 has no foreseeable negative consequences for Debian client
systems.  Even if this were an IETF standard and all the rest of the
Internet implemented it, the only consequence of Debian ignoring rule 9
would be that Debian systems would continue to play better than everyone
else with those sites that were in the process of transitioning away from
round-robin DNS.

So I don't see that much weight should be given to whether other operating
system vendors choose to comply with a rule which is, fundamentally,
misguided and broken.

Furthermore, even if gethostbyname() has been deprecated in POSIX, it's
relevant that there is still plenty of software in Debian that uses this
interface[1].  Almost all of this software is going to be IPv4-only; if we
want Debian to be fully IPv6-capable, these are programs that will need to
be updated to use the getaddrinfo() interface, at which point they will
cease to work correctly with round-robin DNS in the absence of additional
code to re-randomize addresses(!).  The more work that is needed to make an
IPv4 application function correctly with both IPv4 and IPv6, the less likely
it is that this work will get done; some maintainers may opt not to enable
IPv6 support in their packages rather than use an interface that degrades
behavior under IPv4.
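
To make the cost of that "additional code" concrete, here is a minimal sketch of what an application might have to bolt on after switching to getaddrinfo().  It is not taken from any Debian package: the helper name rerandomize() and the choice to shuffle only within an address family (so that whatever IPv6/IPv4 preference the resolver expressed is kept) are assumptions made for illustration.

```python
import random
import socket


def rerandomize(addrinfos, rng=random):
    """Shuffle getaddrinfo()-style tuples within each address family.

    Families stay in the order in which they first appeared, so only the
    ordering among addresses of the same family is randomized.
    (Hypothetical helper; assumes the usual 5-tuple layout with the
    address family in position 0.)
    """
    by_family = {}
    for info in addrinfos:
        by_family.setdefault(info[0], []).append(info)

    shuffled = []
    for group in by_family.values():  # dicts preserve insertion order
        group = list(group)
        rng.shuffle(group)
        shuffled.extend(group)
    return shuffled
```

An application would call it as rerandomize(socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)) and connect to the first entry.  Note that if the resolver interleaved families, this sketch regroups them, which is one more decision each application author would have to make individually.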

(I wonder how many of the applications in Debian are IPv6-enabled as a
result of local patches, or build options that other vendors aren't
enabling?  Debian has touted its IPv6 support since long before other
vendors considered it relevant; perhaps the other vendors that do follow
rule 9 in their getaddrinfo() implementations would take another look too if
their IPv6 support was more pervasive?)

  Even if you do have one, I still don't see any reason to think this is a
  reasonable default behavior on the real-world Internet.

 As it happens I largely agree 

Re: glibc's getaddrinfo() sort order

2007-09-23 Thread Florian Weimer
* Clint Adams:

 On Tue, Sep 18, 2007 at 08:41:45PM +0200, Kurt Roeckx wrote:
 glibc is the only implementation I know of that does this.

 I have heard, though not confirmed first-hand, that modern
 versions of FreeBSD, Windows, and Solaris do as well.

FreeBSD 6.2-RELEASE doesn't do it.  And neither does Fedora (with GNU
libc 2.6.90-15, IPv6 not enabled).  (Windows is an entirely different
matter because the resolver model is completely different.)

You can run the following test program repeatedly to check if every A
record gets its chance.

import socket

# Print every address in the order getaddrinfo() returned it (UDP, NTP port).
infos = socket.getaddrinfo('pool.ntp.org', 123, 0, socket.SOCK_DGRAM)
print(', '.join(info[4][0] for info in infos))





Re: glibc's getaddrinfo() sort order

2007-09-23 Thread Anthony Towns
On Sun, Sep 23, 2007 at 04:21:58AM -0700, Steve Langasek wrote:
 On Fri, Sep 21, 2007 at 01:07:49PM +1000, Anthony Towns wrote:
  On Thu, Sep 20, 2007 at 06:19:10PM -0700, Steve Langasek wrote:
   So do you have a use case where you think the behavior described in rule 9
   *is* desirable?
  Any application written assuming this behaviour, works correctly on
  Windows, Solaris, *BSD and glibc based systems in general, but not
  on Debian.
 So my argument here is that I don't believe there *are* any applications
 being written that assume this behavior; and that even if there were, such
 applications would either work just fine with the previous getaddrinfo()
 behavior, or be too pathological to live.

There's two aspects to RFC3484's behaviour: first that it creates a
much more stable ordering of its results than could have been expected
otherwise, and second it tries to make that ordering more optimal than
a random ordering would be wrt routing.

Stability is useful for any case where the servers hosting a particular
service might be out of sync with each other; eg, if stability could be
assumed we'd have fewer errors where an invocation of apt-get update chooses one
mirror, and a subsequent apt-get upgrade chooses a different server
that hasn't finished syncing. Hopefully apt-get isn't considered
too pathological to live...

Better routing has less direct benefits to the client, probably limited
to slightly better ping times, with a small chance of somewhat cheaper
bandwidth costs. For the people providing the service, it lets you make
better assumptions as to load balancing -- you can expect the servers
based in a particular area to be serving a load proportional to the
number of users in that area, rather than having the load fairly evenly
distributed globally. Of course, there are other ways of doing this
that don't rely on how the client's resolver is implemented. Of course,
if the routing is worse, those turn into drawbacks instead of benefits.

 Instead, taken over the whole Internet rule 9 is statistically a
 pseudo-randomization relative to the *correct* sorting[1],

If that were the case it would be no worse than round-robin selection of
preferred address.

You can only take it over the whole Internet if you're assuming an equal
distribution across all IPs, which isn't valid for IPv4 (where there's
presumably a significant bias to private IPs), and presumably isn't valid
for any particular service, which will be heavily biassed to particular
IP ranges by correlation with location or language...

 One of the existing use cases that breaks is round-robin DNS.  

Round-robin DNS isn't broken; the expectation of (approximately) equal
load-distribution across all servers in a round-robin is broken.

 They might be reasons why RR DNS would be
 an acceptable sacrifice in favor of other beneficial features, but rule 9 as
 written offers *no* benefits in the general case!

Even without the possibility of applications like apt-get benefiting
from stability of results, I don't think we've done anywhere near enough of
a review to be declaring that there aren't any benefits to rule 9.

 So I don't see that much weight should be given to whether other operating
 system vendors choose to comply with a rule which is, fundamentally,
 misguided and broken.

As far as I can see, for rule 9 to be fundamentally misguided and
broken, the concept of providing a stable answer, or a better than random
ordering, would need to be harmful. If they're beneficial, even in some
cases, then we've got a problem in the details of the specification,
not a fundamental issue.

(Note that prefix matching is the only reordering rule that has any
effect in almost all actual cases, so without that rule or a replacement,
both stability and any improvements in routing disappear)

Note that stability isn't definitively a good thing -- if the first
server you connect to happens to be the only one that's down/unreachable,
then with a stable resolver you need to have specific failover code
to use a different address; whereas if you can expect gethostbyname()
to return a different first result, you can just rerun the program.
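
For illustration, the "specific failover code" mentioned above could look roughly like the following.  This is a sketch under assumptions, not code from apt-get or any other program in the thread: the function name and the injectable connect parameter are hypothetical, and Python's own socket.create_connection() already performs a similar loop when given a hostname.

```python
import socket


def connect_first_reachable(addresses, port, timeout=5.0,
                            connect=socket.create_connection):
    """Try each address in order and return (address, connection).

    With a stable resolver ordering the first address is always the same,
    so the caller has to walk the list itself when that host is down.
    (Hypothetical helper; `connect` is injectable so it can be tested
    without a network.)
    """
    last_error = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout)
        except OSError as exc:  # refused, unreachable, timed out, ...
            last_error = exc
    raise last_error or OSError("no addresses to try")
```

With a resolver that returns a different first address on each call, a program without such a loop can simply be rerun; with a stable ordering it needs something like the above to get past a dead first entry.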

 Furthermore, even if gethostbyname() has been deprecated in POSIX, it's
 relevant that there is still plenty of software in Debian that uses this
 interface[1].  Almost all of this software is going to be IPv4-only; if we
 want Debian to be fully IPv6-capable, these are programs that will need to
 be updated to use the getaddrinfo() interface, at which point they will
 cease to work correctly with round-robin DNS in the absence of additional
 code to re-randomize addresses(!).  

Uh, round-robin DNS isn't a guarantee that any individual client will
get different or randomised results -- and the argument that round-robin
won't break anything that relies on rule 9 goes the other way too.

Further, having getaddrinfo() behave differently for IPv4 and IPv6
isn't completely helpful in making Debian support IPv6 -- if we change
a program 

Re: glibc's getaddrinfo() sort order

2007-09-22 Thread Florian Weimer
* Anthony Towns:

 I don't agree with making a decision to go against an IETF standard

RFC 3484 is not an IETF standard.





Re: glibc's getaddrinfo() sort order

2007-09-21 Thread Ian Jackson
Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
 On Thu, Sep 20, 2007 at 06:19:10PM -0700, Steve Langasek wrote:
  So do you have a use case where you think the behavior described in rule 9
  *is* desirable?
 
 Any application written assuming this behaviour, works correctly on
 Windows, Solaris, *BSD and glibc based systems in general, but not
 on Debian.

You're completely missing the point.

Applications are NOT written assuming this behaviour.

Applications are written assuming the behaviour of gethostbyname and
then later the call to gethostbyname is replaced by getaddrinfo when
the application is upgraded to support IPv6.


Let us take a concrete example: http over tcp as implemented by curl.

Originally, curl would call gethostbyname.  gethostbyname would get
its answers from the DNS, with the usual round robin.  curl would then
connect to the first address in the list.

Across the population of calling clients, curl would (just as with
other applications such as web browsers) pick the one of the available
addresses with roughly equal probability.

So that is what the DNS administrator for the site intends by the
publication of multiple address records.

Now, curl is changed to support IPv6.  These are not very intrusive
changes but one of them involves a pretty much direct replacement of
gethostbyname with getaddrinfo.

After being changed in this way curl will sort the addresses according
to rule 9.  This means for each client the address is always the same,
and which one depends on the client's idea of its own address.  Since
clients are nowhere near uniformly distributed in the address space,
this will direct the traffic quite non-uniformly.

This is not what the DNS administrator had intended and represents a
change to the behaviour.
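
The effect on load can be sketched with a toy simulation.  Everything in it is an assumption chosen for illustration: the three server addresses are documentation prefixes, the clients all sit in 192.168.1.0/24, and the helper names are hypothetical.  It models only the prefix-length comparison of rule 9, not the rest of RFC3484.

```python
import ipaddress
from collections import Counter


def common_prefix_len(a, b):
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def load_rule9(clients, servers):
    """Each client connects to the server sharing its longest prefix."""
    # max() keeps the first of several equal candidates, i.e. list order.
    return Counter(
        max(servers, key=lambda s: common_prefix_len(c, s)) for c in clients
    )


def load_round_robin(clients, servers):
    """Each successive query sees the list rotated by one position."""
    return Counter(servers[i % len(servers)] for i, _ in enumerate(clients))


servers = ['192.0.2.1', '198.51.100.1', '203.0.113.1']
clients = ['192.168.1.%d' % i for i in range(1, 7)]
print(load_rule9(clients, servers))
print(load_round_robin(clients, servers))
```

In this toy setup all six clients land on 192.0.2.1 under the prefix rule, merely because their addresses also begin with 192, while the round-robin model sends two clients to each server.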


So to recap the three possibilities I mentioned were:

] (a) It is correct that the behaviour of applications (and hence of
] hosts) should be changed to comply with rule 9.

Ie the DNS administrator was wrong, even if these DNS records were
published before the change was made or before RFC3484 was written.
I assume you're not proposing this.

] (b) Application behaviour should not change; getaddrinfo should
] behave the same way as gethostbyname.

This seems obviously correct to me.

] (c) Application behaviour should not change but getaddrinfo should
] comply with rule 9.  Applications should therefore not be changed
] to use getaddrinfo instead of gethostbyname.

And yours, which seems like a version of (c) to me:

] (d) Applications should use getaddrinfo(), and if the ordering behaviour
] it uses is not desired, they should use an ordering that is desired.

Is the ordering behaviour desired ?  Obviously not.

So you seem to be suggesting that the direct replacement of
gethostbyname with getaddrinfo is wrong in this case.

So how should curl be changed to use the desired (DNS round robin, or
equivalent) ordering ?

What is special about curl ?  I could replace curl with almost any
other application in the argument above and come to the same
conclusions.


Ian.





Re: glibc's getaddrinfo() sort order

2007-09-21 Thread Ian Jackson
Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
 As it happens I largely agree with that. I don't agree with making a
 decision to go against an IETF standard and glibc upstream lightly,
 though, no matter how many caps Ian expends repeating that it's at the
 least mature level of Internet standard.


Firstly: the STANDARD BEHAVIOUR FOR IPV4 IS THAT IMPLEMENTED BY
GETHOSTBYNAME.  I wonder how familiar you are with Internet protocol
standardisation, and the IETF ?  The purely document-oriented and de
jure approach you're taking doesn't seem to match actual Internet
practice very well. Internet standards are living documents describing
an evolving network.  It is well known that if you read the RFCs as
your only source of guidance for implementation you will go badly
wrong.  Making reference to an RFC which contradicts long-established
existing behaviour is rather beside the point.

Secondly: RFC3484 mandates that all applications should change, even
those using gethostbyname.  (You have completely ignored this point.)

Thirdly: I'm not saying we should make this decision lightly.  Saying
"we shouldn't go against ... lightly" is just weasel-words.  Is this
discussion "[going] against ... lightly" ?  No, of course not.  What
that argument would really be if you had any confidence in it would be
"shouldn't go against ... at all" - but of course that's absurd.


I would like to expand on this point about standards.

Slavish adherence to standards, or to the views of mistaken upstreams,
is generally a mistake.  This is particularly the case for the
Debian Technical Committee.

The TC's job is to decide what the correct behaviour is, by
considering the technical merits.  The TC's job is not to interpret
standards documents.  (Indeed, within our jurisdiction, our job
includes changing them if we disagree with them.)

Obviously we need to use standards documents to help understand the
behaviour of the actual computing systems, to understand what is
expected of our systems and what responses other systems are likely to
produce.  If we find ourselves in clear disagreement with a standard we
ought to ask ourselves whether we're sure we really understand the
situation fully.

As the implementor of a DNS resolver library, a past IETF participant,
a DNS administrator, and someone who's followed some of the IPv6
transition work, I'm convinced I have that understanding.

If you feel you don't have that understanding then please ask the
questions which would help you gain it.  I think we should be able to
answer them.


Ian.





Re: glibc's getaddrinfo() sort order

2007-09-20 Thread Steve Langasek
On Wed, Sep 19, 2007 at 05:08:25AM +1000, Anthony Towns wrote:
 On Tue, Sep 18, 2007 at 07:18:40PM +0100, Ian Jackson wrote:
  There are only three possibilities:
  (a) It is correct that the behaviour of applications (and hence of
  hosts) should be changed to comply with rule 9.
  (b) Application behaviour should not change; getaddrinfo should
  behave the same way as gethostbyname.
  (c) Application behaviour should not change but getaddrinfo should
  comply with rule 9.  Applications should therefore not be changed
  to use getaddrinfo instead of gethostbyname.

 No, there aren't. A fourth possibility is:

   (d) Applications should use getaddrinfo(), and if the ordering behaviour
   it uses is not desired, they should use an ordering that is desired.

The ordering rules affect all DNS queries, and the topography of the IPv4
Internet is such that we know this ordering is going to be a wash in the
general case, give pessimal behavior in a subset of cases, and break the
utility of round-robin DNS in the majority of cases where the nodes aren't
all hosted in the same IP assignment.

So do you have a use case where you think the behavior described in rule 9
*is* desirable?

Even if you do have one, I still don't see any reason to think this is a
reasonable default behavior on the real-world Internet.

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
[EMAIL PROTECTED]   http://www.debian.org/





Re: glibc's getaddrinfo() sort order

2007-09-20 Thread Anthony Towns
On Thu, Sep 20, 2007 at 06:19:10PM -0700, Steve Langasek wrote:
 So do you have a use case where you think the behavior described in rule 9
 *is* desirable?

Any application written assuming this behaviour, works correctly on
Windows, Solaris, *BSD and glibc based systems in general, but not
on Debian.

In the bug log, Pierre reported this behaviour is already supported on
most of those systems:

] On that matter, according to Aurelien, Vista (maybe XP),
] {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (MacOS X
] and solaris come to mind). So it's kind of a decision of Debian vs. the
] rest of the world. And if I don't really care about the issue of the
] decision technically, this aspect worries me.

Hrm, I see RFC5014 (from this month) provides some socket options for
changing the way RFC3484 source address selection works, and envisages
the possibility of doing the same for destination address selection. It
assumes prefix matching is undertaken in getaddrinfo in order to achieve
one of its aims.

 Even if you do have one, I still don't see any reason to think this is a
 reasonable default behavior on the real-world Internet.

As it happens I largely agree with that. I don't agree with making a
decision to go against an IETF standard and glibc upstream lightly,
though, no matter how many caps Ian expends repeating that it's at the
least mature level of Internet standard. If it's also the case that
the RFC-specified behaviour is a de facto standard amongst other OSes,
as the above seems to indicate, then that's even more reason to make
sure we have a clear decision backed up by good, clear reasoning.

Cheers,
aj





Re: glibc's getaddrinfo() sort order

2007-09-19 Thread Clint Adams
On Tue, Sep 18, 2007 at 08:41:45PM +0200, Kurt Roeckx wrote:
 glibc is the only implementation I know of that does this.

I have heard, though not confirmed first-hand, that modern
versions of FreeBSD, Windows, and Solaris do as well.





Re: glibc's getaddrinfo() sort order

2007-09-18 Thread Ian Jackson
Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
 I'm not familiar with how getaddrinfo() has been implemented in the
 past

I think this is an important point.  If you're not familiar with the
history then perhaps I can help explain.


hostname-to-address lookups have up to recently generally been done
with gethostbyname.

The addresses from gethostbyname are ordered as they were returned by
the nameserver (unless special configuration is made locally to
override this, which is rarely done).  When multiple addresses are
available for a single lookup, special code in all widely-deployed
nameservers arranges to rotate or round-robin the returned
addresses: each enquirer gets a new ordering.  This is so that a
single service name can be made to refer to a number of physical
network interfaces (perhaps on different hosts) and the load shared
across them.  This is known as DNS based load balancing.  If the
protocol is one like mail, where callers can be expected to try
multiple addresses if the first doesn't work, this gives you a failover
as well.
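
The rotation described above can be modelled in a few lines.  This is a toy sketch, not how any particular nameserver implements it: the class name is hypothetical, and real servers may rotate, shuffle, or apply a configured ordering.

```python
from collections import deque


class RoundRobinRRset:
    """Toy model of a nameserver rotating one set of A records."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return the current ordering, then rotate for the next enquirer."""
        result = list(self._addresses)
        self._addresses.rotate(-1)  # the first record moves to the end
        return result


rrset = RoundRobinRRset(['192.0.2.1', '192.0.2.2', '192.0.2.3'])
for _ in range(4):
    # A gethostbyname()-style client connects to the first address.
    print(rrset.answer()[0])
```

Successive enquirers are handed 192.0.2.1, 192.0.2.2, 192.0.2.3 and then 192.0.2.1 again as their first address, which is what spreads the load across the interfaces.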

So far so good.  (For clarity, it is the above round-robin
functionality that I am arguing ought to be preserved.)


gethostbyname can theoretically support IPv6 but it can only return
one address type per call.  While there is a way to embed an IPv4
address in an IPv6 address, for circumstances like these, there is no
clear way to tell gethostbyname that the calling application (and the
rest of the stack on which the application relies) will cope with
getting a pile of AF_INET6 back rather than AF_INET.

Therefore for IPv6, a new interface was needed.  The interface
(defined in RFC3493 s6.1 and its predecessors) is getaddrinfo.  It has
several new features most of which aren't relevant here.  The
critical new feature is this:

getaddrinfo allows the application to specify whether it wants to get
only IPv4 addresses or IPv6 addresses as well, and if getting mixed
addresses, whether to encode them as AF_INET or as `v6-mapped'
AF_INET6 (ie, the 32 bits of IPv4 address padded with a specific
prefix to make up an IPv6 address, where the prefix means no actually
this is not an IPv6 address but an IPv4 address and should be used
with IPv4).

Combined with various other new facilities, this makes it reasonably
straightforward to convert an IPv4-only application to be
IPv6-capable.
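
As a rough illustration of the two features just described, the sketch below shows the address-family hint as exposed by Python's socket.getaddrinfo() and the v4-mapped encoding via the standard ipaddress module.  The helper names are hypothetical, and the C interface offers further hints (such as AI_V4MAPPED) that are not shown here.

```python
import ipaddress
import socket


def lookup(host, port, family=socket.AF_UNSPEC):
    """Resolve host: AF_INET asks for IPv4 only, AF_UNSPEC accepts both."""
    return socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)


def to_v4_mapped(ipv4):
    """Pad an IPv4 address with the ::ffff:0:0/96 prefix."""
    return ipaddress.IPv6Address('::ffff:' + ipv4)


def from_v4_mapped(ipv6):
    """Return the embedded IPv4 address, or None for a native IPv6 address."""
    return ipaddress.IPv6Address(ipv6).ipv4_mapped
```

An application that only understands AF_INET6 sockets can thus still be handed IPv4 destinations, and can recover the original IPv4 address with from_v4_mapped() when it needs to.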

So, in summary: getaddrinfo is intended to replace gethostbyname.


However, additionally, it was realised that if getaddrinfo can return
a mixture of IPv4 and v6 addresses it was necessary to specify in what
order they ought to be returned.

When RFC3484 was written its authors evidently felt that the best way
to do this was to define a comparison function over all addresses,
which would define which address was to be preferred.

Heedless of the effect on the DNS round-robin functionality I describe
above, the authors of RFC3484 specified (s6 rule 9) that all addresses
should be sorted by proximity to the host making the choice - where
proximity is defined as the length of the common initial address
prefix.

This may have been a disputed but arguable definition of real network
proximity for IPv6 at the time 3484 was written.  But it is clear
now that it is not such a measure in the real IPv6 internet, and it
has never been such a measure in the IPv4 internet.

So RFC3484 s6 rule 9 is just wrong, because the reasons behind it do
not apply any more if they ever did.
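
For readers who have not looked at the rule itself, here is a minimal sketch of the comparison it boils down to for IPv4.  It is a simplification under stated assumptions: it takes a single source address, compares raw 32-bit IPv4 addresses, and ignores rules 1 to 8 of RFC3484 s6 entirely.  The function names are hypothetical.

```python
import ipaddress


def common_prefix_len(a, b):
    """Length in bits of the common initial prefix of two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def rule9_order(source, destinations):
    """Sort destinations by longest common prefix with source, longest first.

    sorted() is stable, so destinations with equal prefix length keep the
    order in which the nameserver returned them.
    """
    return sorted(destinations,
                  key=lambda d: common_prefix_len(source, d),
                  reverse=True)


candidates = ['198.51.100.7', '203.0.113.5', '192.0.2.200']
print(rule9_order('192.0.2.10', candidates))
```

From 192.0.2.10 the three candidates share 5, 4 and 24 leading bits respectively, so 192.0.2.200 sorts first.  Between 198.51.100.7 and 203.0.113.5 the order is decided by a single bit in the first octet, which says nothing about which host is actually closer on the network.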


However, it's worse than that: rule 9 is trying to change the
behaviour of existing systems.  If we agree with rule 9 it ought to
apply just as well to applications using gethostbyname.

All existing applications using gethostbyname are not in compliance
with rule 9.  It would perhaps be possible to modify gethostbyname to
sort addresses according to RFC3484 s5 and s6.

But would it be a good idea ?  No, obviously not.  It would change the
behaviour of all of the applications which currently use
gethostbyname.

Currently such applications pick addresses at random (according to
the DNS round robin).  Rule 9 would have applications pick them
according to longest-common-prefix.  This would destroy the DNS based
load balancing arrangements.


What about getaddrinfo ?  Well, there is no reason why a change in API
(to add additional richness needed for new functionality) should so
radically change the behaviour.

And indeed, we see that indeed the DNS load balancing of our own
servers has been broken by this change !

That is, applications are changed from using non-rule-9 gethostbyname
to rule-9 getaddrinfo, and the servers experience wildly unbalanced
load and break.



 The RFC tries to make getaddrinfo return a predictable ordering in the
 face of random orderings from DNS. That seems a perfectly reasonable
 way to define a function in the abstract; though certainly the ordering
 it comes up with can be criticised.

It is not reasonable for the RFC to attempt to specify that the
addresses be returned in a predictable

Re: glibc's getaddrinfo() sort order

2007-09-18 Thread Anthony Towns
On Tue, Sep 18, 2007 at 03:33:51PM +0100, Ian Jackson wrote:
 Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
  I'm not familiar with how getaddrinfo() has been implemented in the
  past
 I think this is an important point.  If you're not familiar with the
 history then perhaps I can help explain.
 hostname-to-address lookups have up to recently generally been done
 with gethostbyname.

Right, gethostbyname I am familiar with (along with the corresponding
DNS round-robin behaviour), and changing its behaviour is certainly
unreasonable.

 [...]

 So far so good.  (For clarity, it is the above round-robin
 functionality that I am arguing ought to be preserved.)

 [...]

 However, additionally, it was realised that if getaddrinfo can return
 a mixture of IPv4 and v6 addresses it was necessary to specify in what
 order they ought to be returned.
 
 When RFC3484 was written its authors evidently felt that the best way
 to do this was to define a comparison function over all addresses,
 which would define which address was to be preferred.
 
 Heedless of the effect on the DNS round-robin functionality I describe
 above, the authors of RFC3484 specified (s6 rule 9) that all addresses
 should be sorted by proximity to the host making the choice - where
 proximity is defined as the length of the common initial address
 prefix.

So if getaddrinfo() has always behaved in this way, I don't see a great
deal of justification in changing it. The bug log indicated that there
were pre-rfc implementations of getaddrinfo() that behaved more like
gethostbyname() at least wrt round-robin DNS; but I've got no way of
verifying that.

 This may have been a disputed but arguable definition of real network
 proximity for IPv6 at the time 3484 was written.  But it is clear
 now that it is not such a measure in the real IPv6 internet, and it
 has never been such a measure in the IPv4 internet.

I hadn't seen any indication it was disputed for IPv6 prior to your mail.
The patch in glibc only affected IPv4 addresses, for that matter.

 So RFC3484 s6 rule 9 is just wrong, because the reasons behind it do
 not apply any more if they ever did.

To give an analogy to the lines I'm thinking along: the definition of
tm_year in the tm struct in time.h is wrong, years since 1900 should be
years since 0 AD, but the spec says otherwise, so programs simply need
to deal with that historical craziness.

That's not quite the same here, in that the spec does (by my reading)
explicitly allow implementors to not behave in that way, but if you're
coding to the spec you certainly can't rely on DNS round-robin being passed
through an invocation of getaddrinfo().

 However, it's worse than that: rule 9 is trying to change the
 behaviour of existing systems.  If we agree with rule 9 it ought to
 apply just as well to applications using gethostbyname.

 All existing applications using gethostbyname are not in compliance
 with rule 9.  

The RFC specifies the behaviour of getaddrinfo(), not gethostbyname(),
so doesn't affect any apps that solely use gethostbyname(). So no, it
shouldn't be applied to other functions anymore than the definition of
tm_year should mean we count from 1900 in every year related function.

I think we can safely say that Rule 9 isn't useful for IPv4 addresses.
I'm not sure that's true or not for IPv6 addresses -- it certainly seems
an inappropriately hierarchical way of viewing a network that's connected
much more ... fluidly than that, at any rate. But even if Rule 9 is
completely useless and counterproductive, it's still the standard for
that function, which, afaics, we should be meeting.

 What about getaddrinfo ?  Well, there is no reason why a change in API
 (to add additional richness needed for new functionality) should so
 radically change the behaviour.

Agreed in principle, but this is a rule the RFC should've followed;
since they haven't, I'm not convinced we should.

 It is not reasonable for the RFC to attempt to specify that the
 addresses be returned in a predictable ordering when the established
 behaviour, relied on throughout the internet for decades, has been
 that the addresses are _not_ returned in a predictable order.

Again, I agree with that, but the RFC *has* done that.

  I'd say it's more important that getaddrinfo() on Debian behave the same
  as on other operating systems, than that it behave in the same way as
  other functions. I can only take the RFC's assertion as to getaddrinfo()'s
  proper behaviour though; I don't have a more direct idea how getaddrinfo()
  behaves in previous versions of Debian, other Linux distros, other libcs,
  Windows, etc.
 This argument is an argument for accepting any crap that comes out of
 glibc upstream.

No, it's an argument for accepting any crap that comes out of the Internet
standards process. :-/

 As I have demonstrated above, the RFC is wrong, inconsistent with
 existing practice, 

It's certainly inconsistent with gethostbyname()'s existing

Re: glibc's getaddrinfo() sort order

2007-09-18 Thread Kurt Roeckx
On Wed, Sep 19, 2007 at 03:03:51AM +1000, Anthony Towns wrote:
  
  Heedless of the effect on the DNS round-robin functionality I describe
  above, the authors of RFC3484 specified (s6 rule 9) that all addresses
  should be sorted by proximity to the host making the choice - where
  proximity is defined as the length of the common initial address
  prefix.
 
 So if getaddrinfo() has always behaved in this way, I don't see a great
 deal of justification in changing it. The bug log indicated that there
 were pre-rfc implementations of getaddrinfo() that behaved more like
 gethostbyname() at least wrt round-robin DNS; but I've got no way of
 verifying that.

glibc is the only implementation I know of that does this.

I've attached a small test program.  The results are:
sarge: libc6 2.3.2.ds1-22sarge5: random order
etch: libc6 2.3.6.ds1-13etch2: ordered results

One other implementation I'm aware of is in libbind.  You'll need to run
configure with --enable-libbind for that.  It doesn't reorder the results.

I don't know if any of the other libcs in Debian actually provide
getaddrinfo(), but I doubt they'll reorder.

There are also lots of applications that have a wrapper around
gethostbyname() in case the libc doesn't provide getaddrinfo().  It's
highly unlikely any of those will do any reordering.

  This may have been a disputed but arguable definition of real network
  proximity for IPv6 at the time 3484 was written.  But it is clear
  now that it is not such a measure in the real IPv6 internet, and it
  has never been such a measure in the IPv4 internet.
 
 I hadn't seen any indication it was disputed for IPv6 prior to your mail.
 The patch in glibc only affected IPv4 addresses, for that matter.

I've also stated that it might not work properly for IPv6.  It's
likely that something in the same /32 is close network wise, it's
even more likely for /48 and /64, but you probably don't want to go
below the /32.


Kurt





Re: glibc's getaddrinfo() sort order

2007-09-18 Thread Kurt Roeckx
On Tue, Sep 18, 2007 at 08:41:45PM +0200, Kurt Roeckx wrote:
 I've attached a small test program.  The results are:
 sarge: libc6 2.3.2.ds1-22sarge5: random order
 etch: libc6 2.3.6.ds1-13etch2: ordered results

Maybe I should attach it.


Kurt

#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>

/* Print the IPv4 addresses of 0.pool.ntp.org in the order returned. */
int main()
{
	struct addrinfo *res, *p, hints;
	int err;

	hints.ai_flags = 0;
	hints.ai_family = PF_UNSPEC;
	hints.ai_socktype = SOCK_DGRAM;
	hints.ai_protocol = 0;
	hints.ai_addrlen = 0;
	hints.ai_addr = NULL;
	hints.ai_canonname = NULL;
	hints.ai_next = NULL;

	err = getaddrinfo("0.pool.ntp.org", "ntp", &hints, &res);
	if (err != 0)
	{
		/* res is not valid on failure, so stop here */
		fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
		return 1;
	}

	for (p = res; p; p = p->ai_next)
	{
		if (p->ai_family == AF_INET)
		{
			char ip[INET_ADDRSTRLEN];
			if (inet_ntop(p->ai_family,
				&(*(struct sockaddr_in *)p->ai_addr).sin_addr,
				ip, sizeof(ip)) != NULL)
			{
				printf("%s\n", ip);
			}
		}
	}
	freeaddrinfo(res);
	return 0;
}



Re: glibc's getaddrinfo() sort order

2007-09-18 Thread Ian Jackson
Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
 So if getaddrinfo() has always behaved in this way, I don't see a great
 deal of justification in changing it. The bug log indicated that there
 were pre-rfc implementations of getaddrinfo() that behaved more like
 gethostbyname() at least wrt round-robin DNS; but I've got no way of
 verifying that.

I don't know whether or not there were previous versions of
getaddrinfo with the same behaviour as gethostbyname, but that is the
wrong way of looking at it.  getaddrinfo wasn't in widespread use
until the recent efforts to support IPv6.

Did you miss the bits where I said that
 * getaddrinfo is supposed to replace gethostbyname
 * applications are being changed to call getaddrinfo instead of
   gethostbyname
?

There are only three possibilities:

(a) It is correct that the behaviour of applications (and hence of
hosts) should be changed to comply with rule 9.
(b) Application behaviour should not change; getaddrinfo should
behave the same way as gethostbyname.
(c) Application behaviour should not change but getaddrinfo should
comply with rule 9.  Applications should therefore not be changed
to use getaddrinfo instead of gethostbyname.

Which of these are you proposing ?  RFC3484 says (a) but is wrong for
the reasons I have explained.  (b) is my view.   (c) is obviously
unreasonable.

Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
  All existing applications using gethostbyname are not in compliance
  with rule 9.  
 
 The RFC specifies the behaviour of getaddrinfo(), not gethostbyname(),

Nonsense.  It doesn't specify the behaviour of any such API at all.

RFCs like this one specify the behaviour of _hosts_.  That is, it
specifies what kind of packets the host should emit and accept, on
what interfaces.

There is nothing in RFC3484 that limits its application to getaddrinfo
rather than gethostbyname.

There is discussion in s8 which suggests some possible behaviours of
getaddrinfo as an `implementation strategy' for RFC3484 - but note
that our getaddrinfo doesn't do what s8 suggests (because s8 is
barking mad).  If you agree with RFC3484 s8 then you ought to conclude
that similar changes ought to be made to other internal interfaces
which do the same job as getaddrinfo.

 so doesn't affect any apps that solely use gethostbyname(). So no, it
 shouldn't be applied to other functions any more than the definition of
 tm_year should mean we count from 1900 in every year-related function.

This business about tm_year is a complete red herring.

In fact, you've got my argument completely backwards.

If someone wrote in a standards document that tm_year should be zero
at 0AD (whatever that means) rather than 1900AD, what should we do ?

Well, the answer would be obvious: we should continue to do what we
have done forever, so as not to change the meaning of existing
infrastructure: zero at 1900AD.

This is what RFC3484 s6 is doing.  It is trying to change the meaning
of existing deployments of multiple IPv4 addresses in the global DNS.

 I think we can safely say that Rule 9 isn't useful for IPv4 addresses.

Are you happy then that we should mandate that the Debian libc
maintainer should change our libc accordingly ?

 I'm not sure whether that's true for IPv6 addresses -- it certainly seems
 an inappropriately hierarchical way of viewing a network that's connected
 much more ... fluidly than that, at any rate. But even if Rule 9 is
 completely useless and counterproductive, it's still the standard for
 that function, which, afaics, we should be meeting.

It is NOT THE STANDARD as I have previously pointed out.

An IETF working group proposed that it ought to become the standard,
but (1) the document has not advanced further and (2) that was at a time
when IPv6 addressing structure was understood very differently.

To justify my point 2, that RFC3484 predates substantial changes in
the IPv6 addressing architecture:

Site-local addresses are one of the key features that motivates the
rules in RFC3484.  These were deprecated by RFC3879 (status: PROPOSED)
and this was confirmed in RFC4291 (status: DRAFT).

(The standards track goes PROPOSED - DRAFT - STANDARD.)

DNS for IPv6 was originally intended to be supported with A6, DNAME
and bitstring labels according to RFC2874.  This was originally
Standards Track and was designed to support rapid and continuous
renumbering.  With the publication of RFC3363 (s1.1) and supported by
the arguments in RFC3364, 2874 was moved to EXPERIMENTAL (ie, off the
Standards Track), because rapid and continuous renumbering is no
longer planned.

Ie, the addressing and numbering arrangements for IPv6 have changed
significantly since 3484 was written.  That could well be why 3484
hasn't progressed.

  What about getaddrinfo ?  Well, there is no reason why a change in API
  (to add additional richness needed for new functionality) should so
  radically change the behaviour.
 
 Agreed in principle, but this is a rule

Re: glibc's getaddrinfo() sort order

2007-09-18 Thread Anthony Towns
On Tue, Sep 18, 2007 at 07:18:40PM +0100, Ian Jackson wrote:
 There are only three possibilities:
 (a) It is correct that the behaviour of applications (and hence of
 hosts) should be changed to comply with rule 9.
 (b) Application behaviour should not change; getaddrinfo should
 behave the same way as gethostbyname.
 (c) Application behaviour should not change but getaddrinfo should
 comply with rule 9.  Applications should therefore not be changed
 to use getaddrinfo instead of gethostbyname.

No, there aren't. A fourth possibility is:

  (d) Applications should use getaddrinfo(), and if the ordering behaviour
  it uses is not desired, they should use an ordering that is desired.

Since we're at the point where you're yelling at me about how I'm not
listening, I won't reply further.

Cheers,
aj





Re: glibc's getaddrinfo() sort order

2007-09-18 Thread Anthony Towns
On Tue, Sep 18, 2007 at 08:41:45PM +0200, Kurt Roeckx wrote:
 On Wed, Sep 19, 2007 at 03:03:51AM +1000, Anthony Towns wrote:
  So if getaddrinfo() has always behaved in this way, I don't see a great
  deal of justification in changing it. [...]
 glibc is the only implementation I know of that does this.

Windows implementations would seem like the other candidate, given the
"Microsoft Research" at the top of that RFC.

Cheers,
aj





Re: glibc's getaddrinfo() sort order

2007-09-18 Thread Andreas Barth
* Ian Jackson ([EMAIL PROTECTED]) [070918 16:35]:
 So RFC3484 s6 rule 9 is just wrong, because the reasons behind it do
 not apply any more if they ever did.

I have a quote from the dns-operations list:
http://lists.oarci.net/pipermail/dns-operations/2007-September/002028.html
| Either it [RFC3484] should be corrected or declared Historic.


DENIC is the German domain name authority, and they usually know what
they are doing (especially Peter Koch does).


Cheers,
Andi
-- 
  http://home.arcor.de/andreas-barth/





Re: glibc's getaddrinfo() sort order

2007-09-13 Thread Pierre Habouzit
On Thu, Sep 13, 2007 at 12:14:09AM +, Anthony Towns wrote:
 On Thu, Sep 13, 2007 at 12:06:40AM +0100, Ian Jackson wrote:
  Does anyone have an answer to my point that application of rule 9
  changes the long-established meaning of existing DNS data ?
 
 I'm not familiar with how getaddrinfo() has been implemented in the
 past -- but I think it makes more sense to look at the definition of
 the function than the data it's manipulating.
 
 The RFC tries to make getaddrinfo return a predictable ordering in the
 face of random orderings from DNS. That seems a perfectly reasonable
 way to define a function in the abstract; though certainly the ordering
 it comes up with can be criticised.
 
  I disagree with your answer to that first question.  gethostbyname
  returns results in random order.  getaddrinfo should do the same.
 
 I'd say it's more important that getaddrinfo() on Debian behave the same
 as on other operating systems, than that it behave in the same way as
 other functions. I can only take the RFC's assertion as to getaddrinfo()'s
 proper behaviour though; I don't have a more direct idea how getaddrinfo()
 behaves in previous versions of Debian, other Linux distros, other libcs,
 Windows, etc.

  Our tests show that Windows XP since SP1 (or 2?), Vista, various
recent BSDs, and now glibc 2.6 (or 2.5, I don't remember when it was
introduced) all behave this way. I have no access to Mac OS X, but I
wouldn't be surprised if it works the same.  Another interesting data
point would be a test on Solaris.


-- 
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org




Re: glibc's getaddrinfo() sort order

2007-09-12 Thread Ian Jackson
Anthony Towns writes (Re: glibc's getaddrinfo() sort order):
 On Fri, Sep 07, 2007 at 01:06:06AM +0200, Kurt Roeckx wrote:
  It's at least in the spirit of the rfc to prefer one that's on the local
  network.  It might be the intention of rule 9, but then rule 9 isn't
  very well written.
 
 Rule 9 seems perfectly well written, it just does something you
 (reasonably) consider undesirable.

Should I take that as agreement with Steve's and my view, that we
should by default not apply rule 9 to IPv4 ?  Your opinion seems
unclear to me.

We haven't heard from the rest of the committee.

Does anyone have an answer to my point that application of rule 9
changes the long-established meaning of existing DNS data ?  (In ways,
I would add, which have proven to cause significant operational
problems in practice.)  As I say, I think that point is unanswerable
and leads inevitably to the conclusion that we should disable this
behaviour by default.


The rest of your (AJ's) mail seems to be getting bogged down a bit.
I'll try to answer what I see as the key aspects.

 In addition, I think there's two different aspects here: the first is
 "should getaddrinfo() return results in random order to aid in load
 distribution?" and the second is "is prefix matching a reasonable way
 to determine a good host to use?"

I disagree with your answer to that first question.  gethostbyname
returns results in random order.  getaddrinfo should do the same.
("Random" isn't quite true, but it's true enough in the usual case.)

 AFAICS, the answer to the first question is simply no, it shouldn't --
 randomised load balancing like that needs to be done at the application
 level,

You are mistaken.  Randomised load balancing like that is _already
done_ using multiple IPv4 addresses in the DNS.  It has been done this
way for nearly two decades.

 [stuff]
 Doing it by changing Rule 9 to:

I don't think this kind of complexity is warranted here.  Even if it
were, you seem to be proposing a strategy which depends on guessing
whether communication with a particular destination address would
involve NAT, which would be fragile.

Ian.





Re: glibc's getaddrinfo() sort order

2007-09-12 Thread Anthony Towns
On Thu, Sep 13, 2007 at 12:06:40AM +0100, Ian Jackson wrote:
 Does anyone have an answer to my point that application of rule 9
 changes the long-established meaning of existing DNS data ?

I'm not familiar with how getaddrinfo() has been implemented in the
past -- but I think it makes more sense to look at the definition of
the function than the data it's manipulating.

The RFC tries to make getaddrinfo return a predictable ordering in the
face of random orderings from DNS. That seems a perfectly reasonable
way to define a function in the abstract; though certainly the ordering
it comes up with can be criticised.

 I disagree with your answer to that first question.  gethostbyname
 returns results in random order.  getaddrinfo should do the same.

I'd say it's more important that getaddrinfo() on Debian behave the same
as on other operating systems, than that it behave in the same way as
other functions. I can only take the RFC's assertion as to getaddrinfo()'s
proper behaviour though; I don't have a more direct idea how getaddrinfo()
behaves in previous versions of Debian, other Linux distros, other libcs,
Windows, etc.

  AFAICS, the answer to the first question is simply no, it shouldn't --
  randomised load balancing like that needs to be done at the application
  level,
 You are mistaken.  [...]

What getaddrinfo() should and shouldn't do is defined by the standard,
not by what would be most useful. :-/

FWIW, if the standard should be changed, it seems to me that it'd carry
more weight having the Debian tech ctte put that recommendation in than
a random DD.

Cheers,
aj





Re: glibc's getaddrinfo() sort order

2007-09-09 Thread Steve Langasek
I concur with all of Ian's comments, and in particular I would also like to
encourage Kurt to champion this issue to the IETF working group.  My own
past experiences suggest that glibc upstream is willing to hide behind
standards not only when they mandate undesirable behavior but also when they
fail to /prohibit/ undesirable behavior, so it would be nice to have a
solution that in the long term doesn't require the Debian glibc maintainers
to diverge from upstream in order to comply with a ruling of the TC.

I would also underscore the additional reason Kurt has pointed out for why
RFC3484 section 6 rule 9 is inappropriate for IPv4 networks, even in the
absence of NAT.  Over the years, the IPv4 address space has become extremely
fragmented, in large part due to an incomplete understanding of the
long-term significance of early stewardship policies.  As an example, by
2003 the ISP I was working for had network allocations in each of 206.x.x.x,
208.x.x.x, and 64.x.x.x, and has since picked up netblocks in 216.x.x.x and
63.x.x.x.  While some of these netblocks do share common prefixes, the
common prefixes are so short that they aren't even specific to North
America[1], and some of the netblocks are far enough apart that rule 9 would
give precedence to half the planet over the router down the hall.
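
To put rough numbers on that, here is a small sketch that counts how
many leading bits two IPv4 addresses share, which is the quantity rule 9
compares.  The helper names ipv4() and shared_bits() are invented for
the example, and only the first octets are taken from the allocations
above; any host parts would be made up.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four octets into a host-byte-order 32-bit IPv4 address. */
static uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
	assert(a < 256 && b < 256 && c < 256 && d < 256);
	return ((uint32_t)a << 24) | ((uint32_t)b << 16)
	     | ((uint32_t)c << 8) | (uint32_t)d;
}

/* Count the leading bits that x and y have in common (0..32). */
static int shared_bits(uint32_t x, uint32_t y)
{
	uint32_t diff = x ^ y;
	int n = 0;

	/* walk from the most significant bit until the first difference */
	for (uint32_t mask = 0x80000000u; mask != 0 && !(diff & mask); mask >>= 1)
		n++;
	return n;
}
```

With a source address in 206.x.x.x, shared_bits() gives 3 for a
destination in 208.x.x.x or 216.x.x.x and 0 for one in 64.x.x.x or
63.x.x.x, but 7 for an unrelated network in 207.x.x.x, so the stranger
would be sorted ahead of the ISP's own netblocks.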

Rule 9 follows naturally from IPv6 allocation policies which have been
crafted in direct response to the experiences with IPv4 with the intent of
minimizing address space fragmentation.  In IPv6, 64 bits of the address are
host bits, and another 16 bits of the prefix denote local networks, with
the remaining 48 bits corresponding fairly well with network topology.  This
rule is therefore a sensible default for IPv6, but for IPv4 it easily
results in pessimal behavior and should not be a default.

Cheers,
-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
[EMAIL PROTECTED]   http://www.debian.org/

[1] http://xkcd.com/195/ :-)





Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Ian Jackson
Kurt Roeckx writes (Re: glibc's getaddrinfo() sort order):
 It's at least in the spirit of the rfc to prefer one that's on the local
 network.  It might be the intention of rule 9, but then rule 9 isn't
 very well written.

I agree that applying RFC3484 section 6 rule 9 to IPv4 addresses is a
mistake and that therefore we should change the default in Debian
accordingly.  I would encourage Kurt to take this matter up with the
relevant IETF working group.


Others have already written about problems involving NAT.  I agree
with this argument (although I don't approve of NAT and it galls me to
use some braindamage involving NAT as an argument for anything).

However there is another argument I would like to make:

A host using getaddrinfo configured to apply rule 9 to IPv4 addresses
will behave quite differently to a host using gethostbyname.  I think
that this change in behaviour is unwarranted.  Whether an application
uses gethostbyname or getaddrinfo is an implementation detail (related
closely to whether that particular application's source code has been
modified to try to support IPv6) and this should not change the
behaviour.

Presently when connecting to a service offering only IPv4 addresses,
most hosts will use gethostbyname and use the addresses offered in
round-robin DNS order.  That is to say, the meaning (pre-RFC3484, and
current de-facto) of a DNS RRset containing several IP addresses is
that the addresses should be tried `uniformly at random' by callers,
as done by the nameserver round-robin RRset rotation algorithm.

RFC3484 section 6 rule 9 applied to IPv4 appears to be an attempt to
change that meaning.  This interpretation of rule 9 for IPv4 as an
attempt to change the meaning of existing deployed DNS RRsets is
supported by the fact that proponents of rule 9 for IPv4 claim that it
will fix existing problems, as in
http://udrepper.livejournal.com/16116.html.

However, it is obviously wrongheaded to attempt to change the defined
meaning of all existing multi-record A RRsets.  On the existing
Internet, zone administrators use multi-record A RRsets in the
knowledge that those RRsets will be used by callers in an
evenly-distributed round-robin fashion as currently implemented by
bind and gethostbyname.

This meaning for multiple A records had been established for well over
a decade by the time 3484 was written and in the intervening years it
has continued to be dominant.  New systems, and systems newly modified
to support IPv6, should continue to interpret existing A RRsets in the
same way as before.

A few cursory web searches show that this new behaviour of getaddrinfo
is indeed causing trouble as applications are converted to IPv6 and
the change in behaviour with IPv4 is found to be undesirable.


Finally, I would like to preemptively address the line "but this is an
RFC and we must do what it says".  There are two responses:

The most obvious one is that RFC3484 is merely Proposed Standard.  At
this stage of the standardisation process one can expect to find
errors, mistaken deviations from existing practice, and so on.
(The IETF standardisation process has been broken so that documents
often get stuck in this state; but that doesn't mean that we should
treat draft documents as if they were gospel, let alone documents that
aren't even drafts.)

The second is a more general point: if a standards document tells us
to do something which is wrong, then we should not do it.  Obviously
we should think fairly hard before making the decision to go against a
standard, but our job is to do the right thing and standards documents
are there to help us not to constrain us.  I think my argument above
about the existing meaning of multiple A records is irrefutable.


 I already suggested that maybe rule 9 should be limited to the common
 prefix length of the netmask you're using.  Another option is that you
 extend rule 2 to have the same behaviour with ipv4, and that 10/8,
 172.16/12 and 192.168/16 should be considered organization-local.

Replacing rule 9 with something more limited based on local network
interfaces (ie, prefer what appear to be locally-attached addresses)
would be fine.  Or a default based on routing metrics would be fine
too.  (Although I think these may be too much work to do in
getaddrinfo.)

The problem occurs when we start ranking IPv4 addresses of foreign
systems about whose topology we have no special knowledge.

Ranking RFC1918 addresses ahead of others is not entirely a safe thing
to do because people sometimes foolishly publish RFC1918 addresses for
public services and expect callers to skip those addresses somehow.
But at least it wouldn't break people who weren't already doing wrong
things.


Ian.





Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Pierre Habouzit
On Fri, Sep 07, 2007 at 07:15:42 +, Pierre Habouzit wrote:
 On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
  Pierre Habouzit wrote:
   Also note that probably many many Windows machines work that way (the
 RFC was written by a MS guy). And this behaviour impacts software
 developers, and people that hoped that having multiple A records for
 their service will see a perfect round robin will be stuck anyways. I
 mean, it's non previous-practice-backward-compliant and one can argue
 reasonably it sucks. But hel-llooo ! this kind of design choice is not
 only local. If every one (or the majority) on the internet behaves like
 this, fixing this bug (if it is really one) in Debian will _not_, I
 say _not_ prevent us from fixing many software that rely on DNS round
 robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE will
 have to cope with that whatever choice is made.

  On that matter, according to Aurélien, Vista (maybe XP) and
{Open,Net,Free}BSD follow the RFC. Other OSes could be tested (Mac OS X
and Solaris come to mind). So it's kind of a decision of Debian vs. the
rest of the world. And while I don't really care about the outcome of the
decision technically, this aspect worries me.

-- 
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org




Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Anthony Towns
On Fri, Sep 07, 2007 at 01:06:06AM +0200, Kurt Roeckx wrote:
 It's at least in the spirit of the rfc to prefer one that's on the local
 network.  It might be the intention of rule 9, but then rule 9 isn't
 very well written.

Rule 9 seems perfectly well written, it just does something you
(reasonably) consider undesirable.

The RFC says:

]   Rule 9:  Use longest matching prefix.
]   When DA and DB belong to the same address family (both are IPv6 or
]   both are IPv4): If CommonPrefixLen(DA, Source(DA)) >
]   CommonPrefixLen(DB, Source(DB)), then prefer DA.  Similarly, if
]   CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
]   then prefer DB.
]
]   Rule 10:  Otherwise, leave the order unchanged.
]   If DA preceded DB in the original list, prefer DA.  Otherwise prefer
]   DB.
]
]   Rules 9 and 10 may be superseded if the implementation has other
]   means of sorting destination addresses.  For example, if the
]   implementation somehow knows which destination addresses will result
]   in the best communications performance.

"The admin says that rule 9 isn't appropriate" seems to fit "somehow
knows which destination address will result in the best communications
performance", so afaict, the description in the new gai.conf,

# sortv4  yes|no
#If set to no, getaddrinfo(3) will ignore IPv4 adresses in rule 9.  See
#section 6 in RFC 3484.  The default is yes.  Setting this option to 
#no breaks conformance to RFC 3484.

is incorrect, in that the implementation is still in conformance
with the RFC.
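
For reference, the rule 9 comparison quoted above (prefer the
destination that shares the longer prefix with its source address) boils
down to a few lines of C when restricted to IPv4.  This is a sketch
only, not glibc's code: ip4(), common_prefix_len4() and rule9_cmp4() are
names made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Parse a dotted-quad string; the input is assumed to be well-formed. */
static struct in_addr ip4(const char *text)
{
	struct in_addr addr;
	int ok = inet_pton(AF_INET, text, &addr);

	assert(ok == 1);
	(void)ok;	/* avoids an unused-variable warning with NDEBUG */
	return addr;
}

/* Number of leading bits two IPv4 addresses have in common (0..32). */
static int common_prefix_len4(struct in_addr a, struct in_addr b)
{
	uint32_t diff = ntohl(a.s_addr) ^ ntohl(b.s_addr);
	int len = 0;

	while (len < 32 && !(diff & 0x80000000u))
	{
		diff <<= 1;
		len++;
	}
	return len;
}

/*
 * Rule 9 for one pair of candidates: returns <0 to prefer DA, >0 to
 * prefer DB, and 0 to fall through to rule 10 (keep the original order).
 */
static int rule9_cmp4(struct in_addr da, struct in_addr src_da,
		      struct in_addr db, struct in_addr src_db)
{
	int la = common_prefix_len4(da, src_da);
	int lb = common_prefix_len4(db, src_db);

	return (la > lb) ? -1 : (la < lb) ? 1 : 0;
}
```

An option like sortv4 would then amount to treating the result as 0 for
every pair of IPv4 destinations, leaving the order from the DNS answer
in place.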

In addition, I think there are two different aspects here: the first is
"should getaddrinfo() return results in random order to aid in load
distribution?" and the second is "is prefix matching a reasonable way
to determine a good host to use?"

AFAICS, the answer to the first question is simply no, it shouldn't --
randomised load balancing like that needs to be done at the application
level, or by giving different sets of IPs in response to DNS queries by
different hosts, such as using BGP or similar. As far as pool.ntp.org
is concerned, that looks like the end of the story, afaics: ntp can't
rely on getaddrinfo to give a suitably random answer.

OTOH, getaddrinfo is meant to give a close answer, and doing prefix
matching on NATed addresses isn't the Right Thing. For IPv6, that's fine
because it's handled by earlier scoping rules. For NATed IPv4 though the
prefix we should be using is whatever the host is going to be NATed *to*.
And that would imply that the Right Thing would be to have an option
more like:

pretend-that 10/8 is-really 1.2.3.4/32

That doesn't seem likely to work though because it requires extra
manual configuration, which won't happen.

Giving up on actually getting getaddrinfo to give close answers for
NATed boxes leaves the option of trying to avoid getaddrinfo going out
of its way to give far answers instead, which would mean turning off
prefix-matching for NATed boxes; which could be done by ignoring rule
9 by default for private IPv4 addresses.

Actually, it might also be reasonable to ignore rule 9 if

scope(DA) > scope(source(DA)) and scope(DB) > scope(source(DB))

which seems reasonably equivalent to "DA and DB are only reachable through
a NAT" for both IPv4 and IPv6. The corner case is if the destination
is in a DMZ and can access both the Internet and local boxes directly,
but I don't think you can get the right answer for that atm anyway.

Doing it by changing Rule 9 to:

   Rule 9:  Use longest matching prefix.
   When DA and DB belong to the same address family (both are IPv6 or
   both are IPv4): If xCommonPrefixLen(DA, Source(DA)) >
   xCommonPrefixLen(DB, Source(DB)), then prefer DA.  Similarly, if
   xCommonPrefixLen(DA, Source(DA)) < xCommonPrefixLen(DB, Source(DB)),
   then prefer DB.

   If scope(X) > scope(Y) then
      xCommonPrefixLen(X,Y) = 0
   Else:
      xCommonPrefixLen(X,Y) = CommonPrefixLen(X,Y)

would give reasonable behaviour, I think (preferring addresses that can
be reached without NAT first, then leaving addresses that require NAT
in the order received).

In essence, the problem is that comparing prefixes of real addresses
against addresses that will be NATed is not adding information, and is
possibly losing information -- eg, if your site DNS already orders A
addresses by prefix matching on your actual IP range.

 I already suggested that maybe rule 9 should be limited to the common
 prefix length of the netmask you're using.  Another option is that you
 extend rule 2 to have the same behaviour with ipv4, and that 10/8,
 172.16/12 and 192.168/16 should be considered organization-local.

Those are specified as having site-local scope in 3.2; but Rule 2 only
comes into play if one of the IPs returned by the nameserver is also
site-local anyway which isn't particularly useful.

Cheers,
aj





Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Pierre Habouzit
On Fri, Sep 07, 2007 at 07:45:52 +, Pierre Habouzit wrote:
 On Fri, Sep 07, 2007 at 07:15:42 +, Pierre Habouzit wrote:
  On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
   Pierre Habouzit wrote:
Also note that probably many many Windows machines work that way (the
  RFC was written by a MS guy). And this behaviour impacts software
  developers, and people that hoped that having multiple A records for
  their service will see a perfect round robin will be stuck anyways. I
  mean, it's non previous-practice-backward-compliant and one can argue
  reasonably it sucks. But hel-llooo ! this kind of design choice is not
  only local. If every one (or the majority) on the internet behaves like
  this, fixing this bug (if it is really one) in Debian will _not_, I
  say _not_ prevent us from fixing many software that rely on DNS round
  robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE will
  have to cope with that whatever choice is made.
 
   On that matter, according to Aurélien, Vista (maybe XP) and
 {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (Mac OS X
 and Solaris come to mind). So it's kind of a decision of Debian vs. the
 rest of the world. And while I don't really care about the outcome of the
 decision technically, this aspect worries me.

  Still one technical point, here is the excerpt from the RFC on the
offending rule:
   Rule 9:  Use longest matching prefix.
   When DA and DB belong to the same address family (both are IPv6 or
   both are IPv4): If CommonPrefixLen(DA, Source(DA)) >
   CommonPrefixLen(DB, Source(DB)), then prefer DA.  Similarly, if
   CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
   then prefer DB.

  What it means is that for IPs with the same common prefix length, the
order of the addresses is unchanged wrt how they came up in the DNS answer.
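
  For illustration only, here is a minimal Python sketch of that
comparison. It is not glibc's code: the function names are made up, it
handles IPv4 only, and it assumes a single source address for every
destination, whereas the RFC selects Source(D) per destination. The RFC
also compares IPv4 addresses in their IPv4-mapped form, which adds a
constant 96 bits and does not change the order.

```python
import ipaddress


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits shared by two IPv4 addresses (0..32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    # The highest set bit of the XOR marks the first differing bit.
    return 32 - diff.bit_length()


def rule9_sort(source: str, candidates: list[str]) -> list[str]:
    """Order candidates by longest prefix shared with `source`.

    sorted() is stable, so candidates with equal prefix length keep the
    order in which they appeared in the DNS answer.
    """
    return sorted(candidates, key=lambda dst: -common_prefix_len(dst, source))


if __name__ == "__main__":
    answer = ["198.51.100.1", "192.0.2.200", "203.0.113.5", "192.0.2.77"]
    print(rule9_sort("192.0.2.10", answer))
```

  With a source of 192.0.2.10, the two 192.0.2.x addresses sort first;
with a source that shares no leading bit with any candidate, the DNS
order is returned untouched.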

  What it also means is that when I use apt to fetch from ftp.debian.org
from my home ISP (proxad), it takes the mirror that proxad hosts
(ftp.fr.d.o). When I go to my parents', who use wanadoo (now Orange), it
picks the Oleane one (ftp.fr2.d.o), which is indeed nearer. It makes
complete sense.

  And by the common-prefix rule, on a local network, RR can still be
assumed within a given VLAN. It actually makes quite some sense to me.

  Maybe that's why Joey Hess saw variability: the RFC does not specify a
*full* ordering, it just aims to restrict the RR to the servers nearest
to the client.


  Of course, ISP IPs usually have a first octet smaller than 128, so if
you host a service with RR on a network whose first octet is 128 or
greater, and a mirror on an IP whose first octet is smaller than 128, the
clients of your service at that ISP will never choose the former because
of this rule. This is an RFC that favors people with large mirroring
networks for their service, and hinders people with small mirroring
networks, because they have to choose the IPs for their network servers
with care.
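
  A small, hedged illustration of that first-octet effect follows. The
three addresses are hypothetical and chosen only for their first octets,
and the helper is a made-up name, not anything from glibc. Whenever the
top bit of the first octet differs, the common prefix length is 0, so any
candidate sharing that top bit with the client wins under rule 9.

```python
import ipaddress


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits shared by two IPv4 addresses (0..32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


client = "82.0.0.1"      # hypothetical ISP customer, first octet < 128
rr_server = "130.0.0.1"  # hypothetical RR host, first octet >= 128
mirror = "62.0.0.1"      # hypothetical mirror, first octet < 128

# The top bit differs, so nothing is shared with the RR host.
print(common_prefix_len(client, rr_server))  # 0
# The top bit matches, so the mirror shares at least one bit.
print(common_prefix_len(client, mirror))     # 1
```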


  I think I've described everything important for the Ctte to rule on this,
so unless a question pops up, I'll let you rule in peace :)

-- 
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org




Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Pierre Habouzit
On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
 Pierre Habouzit wrote:
The point is, there is an RFC, and we put a patch so that admins can
  disable it using gai.conf.
 
 "There is an RFC" is not always a good excuse for breaking existing systems.
 
 "Admins can disable it" is not a good argument when one common class of
 the breakage is all the systems that _don't_ disable it hammering
 systems that have round-robins set up to distribute load. More
 generally, "we added an option so your bug is fixed" is a common
 fallacy.

  The point is: the option is here, and I don't really care whether the
Ctte decides to set it to true or false by default. My underlying point was
just that the switch is easy for us. So yeah, if we change the default
option, the bug is definitely fixed, and that is anything but a fallacy.

  OTOH there is no way upstream will change that (Ulrich refused the
patch with blatant aggressiveness), so every other distribution (Fedora
and Red Hat, probably many others) will work that way. So we can rule
whatever we want here (and I absolutely don't care about the outcome of
the decision, I was just giving some pointers to the pros as I
assumed that the cons were obvious to anyone); this will not change how
upstream glibc works, so many people (probably a majority?) will use
this new scheme anyway.

  And I also say that knowing Uli, (and knowing how deeply I care about
this issue ;p) I won't spend a minute trying to argue with Uli, I'm not
insane, and don't have the man-years to do that.

  Also note that probably many, many Windows machines work that way (the
RFC was written by a MS guy). And this behaviour impacts software
developers, and people who hoped that having multiple A records for
their service would give a perfect round robin will be stuck anyway. I
mean, it's not backward-compatible with previous practice, and one can
reasonably argue it sucks. But hel-llooo! This kind of design choice is
not only local. If everyone (or the majority) on the Internet behaves
like this, fixing this bug (if it really is one) in Debian will _not_, I
say _not_, save us from having to fix the many programs that rely on DNS
round robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE
will have to cope with that whatever choice is made.

 BTW, I'm seeing some programs that use getaddrinfo and still don't have
 the RFC 3484 sorting behavior. Is this controlled by the AI_ADDRCONFIG flag?

  TTBOMK it's a bug wrt intended behaviour as per upstream.

-- 
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org




Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Kurt Roeckx
On Fri, Sep 07, 2007 at 06:54:21PM +1000, Anthony Towns wrote:
 OTOH, getaddrinfo is meant to give a close answer, and doing prefix
 matching on NATed addresses isn't the Right Thing. For IPv6, that's fine
 because it's handled by earlier scoping rules. For NATed IPv4 though the
 prefix we should be using is whatever the host is going to be NATed *to*.
 And that would imply that the Right Thing would be to have an option
 more like:
 
   pretend-that 10/8 is-really 1.2.3.4/32
 
 That doesn't seem likely to work though because it requires extra
 manual configuration, which won't happen.
 
 Giving up on actually getting getaddrinfo to give close answers for
 NATed boxes leaves the option of trying to avoid getaddrinfo going out
 of its way to give far answers instead, which would mean turning off
 prefix-matching for NATed boxes; which could be done by ignoring rule
 9 by default for private IPv4 addresses.

The problem with IPv4 is not only about NAT.  It just happens to show
the problem better.

With the IPv6 allocation policies, it's likely that the more high-order
bits match, the closer the host is network-wise.  That is rather unlikely
in the IPv4 case, especially if you go beyond /16.


Kurt





glibc's getaddrinfo() sort order

2007-09-06 Thread Kurt Roeckx
Hi,

I disagree with the glibc maintainer(s) about whether getaddrinfo()
should sort the results or not.  I think the current way it sorts things
does not work at all for IPv4, and I think it does more harm than
good.

I'm seeking input from the tech-ctte on how to handle this.


Kurt





Re: glibc's getaddrinfo() sort order

2007-09-06 Thread Pierre Habouzit
On Thu, Sep 06, 2007 at 10:04:23PM +, Kurt Roeckx wrote:
 Hi,
 
 I disagree with the glibc maintainer(s) about whether getaddrinfo()
 should sort the results or not.  I think the current way it sorts things
 does not work at all for IPv4, and I think it does more harm than
 good.
 
 I'm seeking input from the tech-ctte on how to handle this.

  The point is, there is an RFC, and we put a patch so that admins can
disable it using gai.conf.

  Note also that old calls like gethostbyname still return addresses in
random order. It's more or less accepted that people using getaddrinfo
should be aware of the RFC requirements wrt ordering, and that
applications for which DNS round robin matters should implement their own
randomization.
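
  As a rough sketch of what such application-side randomization could
look like (Python's socket module is used here purely for illustration;
the helper names are hypothetical):

```python
import random
import socket


def shuffle_results(results, rng=random):
    """Return a shuffled copy of a getaddrinfo() result list."""
    shuffled = list(results)
    rng.shuffle(shuffled)
    return shuffled


def resolve_randomized(host, port):
    """Resolve host/port, then discard the resolver's ordering."""
    results = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return shuffle_results(results)
```

  Note that a plain shuffle also throws away the other RFC 3484 rules
(for instance any preference between IPv6 and IPv4), so an application
may prefer to shuffle only within one address family.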


  The Ctte may want to read:
  - http://udrepper.livejournal.com/16116.html
  - http://people.redhat.com/drepper/linux-rfc3484.html
-- 
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org




Re: glibc's getaddrinfo() sort order

2007-09-06 Thread Kurt Roeckx
On Fri, Sep 07, 2007 at 12:34:10AM +0200, Pierre Habouzit wrote:
   the Ctte may want to read:
   - http://udrepper.livejournal.com/16116.html
   - http://people.redhat.com/drepper/linux-rfc3484.html

The first one makes a point with which I partly agree, but also partly
disagree.

It's at least in the spirit of the RFC to prefer an address that's on the
local network.  That might be the intention of rule 9, but then rule 9
isn't very well written.

In the case where the server has 2 addresses assigned, I doubt that you're
going to advertise the local one outside.  So you'd at least have a
different response for an internal and an external query.  I don't see
why the internal query should also return the external address.

I already suggested that maybe rule 9 should be limited to the common
prefix length of the netmask you're using.  Another option is to
extend rule 2 to have the same behaviour with IPv4, so that 10/8,
172.16/12 and 192.168/16 are considered organization-local.

Ulrich Drepper actually called it site-local in the second document, but
I think organization-local would be the right scope for it.
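
To make the rule 2 idea concrete, here is a hedged Python sketch.  It is
only one possible reading of the suggestion, not what glibc does: the
function names and the two scope labels are made up, and the three
private ranges are the ones listed above.

```python
import ipaddress

# The three private ranges named above, treated as one scope.
PRIVATE_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def ipv4_scope(addr: str) -> str:
    """Classify an IPv4 address as organization-local or global."""
    ip = ipaddress.IPv4Address(addr)
    if any(ip in net for net in PRIVATE_NETS):
        return "organization-local"
    return "global"


def prefer_matching_scope(source: str, candidates: list[str]) -> list[str]:
    """Put candidates whose scope matches the source's scope first.

    The sort is stable, so the DNS order is kept within each group.
    """
    src_scope = ipv4_scope(source)
    return sorted(candidates, key=lambda dst: ipv4_scope(dst) != src_scope)
```

With a source of 192.168.1.5, a 10.x destination would be preferred over
any public address, while the public addresses keep their DNS order.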


Kurt





Re: glibc's getaddrinfo() sort order

2007-09-06 Thread Joey Hess
Pierre Habouzit wrote:
   The point is, there is an RFC, and we put a patch so that admins can
 disable it using gai.conf.

"There is an RFC" is not always a good excuse for breaking existing systems.

"Admins can disable it" is not a good argument when one common class of
the breakage is all the systems that _don't_ disable it hammering
systems that have round-robins set up to distribute load. More
generally, "we added an option so your bug is fixed" is a common
fallacy.

   Note also that old calls like gethostbyname still return addresses in
 random order. It's more or less accepted that people using getaddrinfo
 should be aware of the RFC requirements wrt ordering, and that
 applications for which DNS round robin matters should implement their own
 randomization.

getaddrinfo was around for many years before RFC 3484. It's been in
glibc since 1996. So you're saying that developers writing code in
the 90's should have somehow been aware of an RFC that was published in
2003.


BTW, I'm seeing some programs that use getaddrinfo and still don't have
the RFC 3484 sorting behavior. Is this controlled by the AI_ADDRCONFIG flag?

-- 
see shy jo

