Re: CVS commit: src/sys (and netbsd-8 stalled pullup request #390)

Robert Elz Mon, 27 Nov 2017 12:29:24 -0800

    Date:        Mon, 27 Nov 2017 09:42:47 +0000
    From:        Roy Marples <[email protected]>
    Message-ID:  <[email protected]>


  | I don't understand the change to ip_output.c (ie why yours works and 
  | mine doesn't) but as long as it works then I'm happy.

That one is easy, ifa_ifwithaddr() needs a struct sockaddr * as its
parameter, you were passing it a (cast to struct sockaddr *) struct in6_addr *
instead, which is an entirely different beast.   That's why the sin6 variable
existed in ip6_ifaddrvalid() in the first place, and was applied to the src
addr (which was the only one checked before) - fortunately the needs don't
overlap, so the same sin6 var could be used for both tests (which saves
most of the init that would be needed if a new one was required) - the
exceedingly ugly way I made the change was just so the copy of the dest
addr into sin6 would be done only when absolutely required.  Rewriting the
code more would probably allow that to be cleaned up.

This type error isn't detected by the compiler because of the use of the
sin6tosa() macro - which converts struct sockaddr_in6 * pointers into
struct sockaddr * pointers (ie: it adds a cast) ... I considered changing
that to be a static inline function, so the compiler could do arg type
verification, but at the time going and checking every use (to make sure
it is never used with some other sockaddr * type - like a sockaddr_storage
perhaps) was too much... (The right fix would be to have a static inline
function for each sockaddr pointer type that we want to be able to treat as a
generic struct sockaddr pointer, and then always use the appropriate one,
but that means lots of code churn I suspect.)

  | > I still don't believe that the way it works is the way it should,
  | > but that's a topic for another day.
  | 
  | Are you talking about the functional change or the code itself?

Perhaps both.   There are 3 changes I would suggest to what is there now,
one of which I suspect that you will agree with (and might be considered
just a change to the code itself, though it does make a functional difference,)
one which I thought you agreed with, as it was the way that the code was
written before Friday, and one which I am fairly confident that you do not
agree with... (yet anyway!)

Easiest first, you changed the rules for source addr selection,
in6_select_best_ia() is/was written to follow the rules in RFCmumble
almost exactly.

First, it eliminates any candidate addresses that are simply invalid (that's
still being done) then it has a very particular order in which it
compares to potential addresses to see which is best.

You changed detached/tentative from "simply invalid" to "valid but not
to be preferred" but left the test in the part of the code that is to
be dealing with invalid addresses, rejecting them - that's why you needed
the test for ia_best != NULL embedded in the new test - none of the later
tests have that, as if you're comparing two potentially valid addresses,
then you must have two of them, there's no need to test for this being the
first one (there is a separate test that says "if this is the first valid
addr we have seen it must be the best one so far" that deals with that case.

Further, by leaving it there, you're preferring a non detached/tentative
addr with the wrong scope to a detached/tentative addr with the correct
score (ie: a fully valid link-local wins over a detached/tentative global)
- I doubt that you really intended to do that.

The new test should be moved down to rule 3, probably just before the
deprecated addr test, that is, a deprecated addr which is still valid
should be preferred over a detached (or tentative) addr  - detached I
will come back to in the next section, but for tentative this means that
if an old addr has become deprecated, and we're in the process of generating
a new one, but DaD has not completed yet, we keep using the old addr
until the new one is no longer tentative, and then switch (the way your
code is now would also do that - but because the check is much too violent.)
This is why you might consider this just to be a coding change (with functional
side effects.)

But there are also arguments for doing those tests the other way, and
preferring a tentative addr over a deprecated addr for new connections
(the new connection might last beyond the period that the deprecated addr
remains valid, whereas the tentative addr is very unlikely to fail DaD and
should remain valid considerably longer.)   I might even make that choice
depend upon the upper level protocol - prefer deprecated for icmp/udp sending,
and prefer tentative for tcp (and for ucp connecting - but the kernel doesn't
make the choice in the latter case, the app does, so we can ignore that.)

Second - the one I thought you agreed with - is the treatment of deprecated
addresses.  I preferred the way the code was last week (before Friday),
that is, exactly the opposite to the complaint from Robert Swindells.
Addresses belong to interfaces, if the interface is non-functional, its
address is (or should be) non-functional as well - including for
applications on the local host.

The "route to lo0" that he kept bringing up is just an implementation
artifact, it makes it easier to handle packets addressed to the local host,
as they all end up at lo0 just using the routing code, we don't need extra
tests all over the place with "or if this address is mine..." to handle that,
and aside from that (internal) use, is (or should be) irrelevant to
everything else.

But if another host cannot connect to the addr I have assigned to
interface xx0 then I should not be able to either - it makes local testing
that all my interfaces are working much easier (just ping all of my
addresses and make sure they're all replying, exactly the same way I
would test them from a remote host - which means I can write an application
that validates the network, and not need to special case testing for
addresses that happen to belong to the host running the tests.)

So, I would return all the detached address tests to the way they were
early last week, it is simply better that way.

Third - the one I think you do not agree with - tentative addresses are
not invalid, they're more like deprecated addresses than detached ones,
and should be treated just like deprecated addresses (as an address we
would prefer not to use if possible, but not in any way invalid.)
As above, for some uses a tentative addr is even better to use than a
deprecated one (whereas for other uses the opposite applies) and of course
an addr that is neither deprecated nor tentative is better than both.

So, while it is entirely reasonable to forbid sending or receiving
packets from (or to) detached addresses, packets to/from tentative
addrs should simply be allowed always (we know we *must* allow sending
from a tentative addr to implement DaD correctly - we must defend the
tentative addr from other hosts' attempts to also DaD the same addr
at the same time, and that means replying to an NDP probe claimimg
that we own the addr already (as they will reply to our probe saying they
own the addr already) and that means sending from the tentative addr.
[Binding to a tentative addr should be allowed, always, as well.]

But it is more than that, a tentative addr is 99.9% (perhaps much higher
in the v6 world) to be validated in a second or so (maybe 2 or 3 if the
host admin is paranoid - and which should probably be done to all manually
assigned addrs - humans make far more mistakes than RNGs or mac addr
assignments) and delaying their use just creates complications.

I would allow treating them just like we treat deprecated addresses, with
a sysctl to say whether or not we should ever use them (which I would just
turn on in sysctl.conf and you can leave turned off if you like) - provided
that we make sure we have not broken DaD - including DaD implemented in a
userland daemon rather than in the kernel, regardless of the state of the
sysctl -- and I certainly support use of "ifconfig -w" to wait for the
addresses to have been validated before using them, if that is what the app
needs.

Aside: is there a way with that to wait for a particular addr to become valid?
The (ifconfig) man page reads like there isn't - not even to wait for addrs 
on a particular interface, but the code might be different - if not, then I
think we need to allow the -i ifconfig option to accompany with -w (and -W)
to apply to just one interface, and perhaps also an addr option, to wait for
just that addr - I'd have thought that would be useful when adding a new
addr just assigned by a dhcp server, we don't really care about any other
addresses that might, already, or while we're waiting, enter DaD, just
the one being assigned right now.


Lastly, I certainly do not think that any of these changes should be pulled
up to -8 until all of this is sorted out.   If ever.

kre

ps: I only really care about any of this in the v6 world, v4 is dead/dying
anyway, regardless of what some people prefer to think or hope.

Re: CVS commit: src/sys (and netbsd-8 stalled pullup request #390)

Reply via email to