lianep at eng.sun.com writes:
> My best suggestion at this point is to modify your service's startup 
> method to explicitly wait on the condition you need to be true -- that 
> the network you need is available.  (Surely Jim will come tell me why 
> this is a bad solution in the world of unreliable networks, but 
> hopefully he'll also show up with a better suggestion. :) )

So, you have a dependency on me to do that?  ;-}

There are several issues here that conspire to cause trouble.  Among
the important ones:

  - We currently refuse to install a route if the next hop address is
    not directly reachable[1].  This means that we have inconsistent
    functionality: if there are no matching interfaces or the best
    matching ones are marked 'down,' we return ENETUNREACH; but if you
    install the route on an 'up' interface, and the interface goes
    down two milliseconds later, that's just fine.  (And with
    RTF_STATIC, we store it in an inaccessible kernel cache to come
    back in a zombie-like way when the interface comes back.  You
    can't get rid of these George Romero creatures with a mere "route
    delete.")

    For Zones, this means that if zoneadmd hasn't configured the
    interfaces, then attempts to configure routes based on those
    interfaces will fail.  That includes the new persistent route
    option.

    For SMF, I'm not sure why this should be a problem.  "zoneadm
    boot" should block until zoneadmd has started the boot process, and even
    though the zone itself isn't entirely up at this point, the
    networking and file systems *have* been set up.  If that's not
    working -- if depending on svc:/system/zones:default isn't enough
    -- then I'd consider it a bug that needs to be investigated.  It
    ought to be fixable, and you shouldn't have to have separate
    waits.

  - Interfaces can be configured by the network itself -- as in RARP,
    BOOTP, DHCP, and other such protocols.  In fact, configuring by
    way of the network is the _norm_ for IPv6 (though at least with
    IPv6, routing next hop addresses are usually link-locals).

    There's no way to predict when (or if) those protocols will
    actually "succeed," and success itself is a transitory thing.
    This means that any mechanism that depends on seeing "success" as
    one stage of a boot process is architecturally flawed.  The sort
    of mechanism proposed here will "work" for statically addressed
    interfaces, but it won't work for dynamic ones.  Though we don't
    yet have DHCP in zones, I'd still say that inventing a mechanism
    that supports only static configuration is senseless in the
    longer term.

    This points towards using dynamic routing protocols to solve the
    problem.  Fortunately, this is exactly the sort of problem they're
    designed to resolve.

  - Zones weren't originally designed to be on separate networks,
    though this is obviously evolving over time.  The original design
    assumed that non-global zones were equivalent to multiple boxes
    placed on the *same* network.  Thus, routing was assumed to be a
    global zone issue alone, and not as much attention was paid to
    sequencing per-zone route creation.

    I'm skeptical of zones placed on different networks without
    actually separate IP instances being involved.  The sort of
    deliberate segregation implied by having distinct networks seems
    incompatible with a shared stack.
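
To make the first point concrete, here is a rough sketch of the
inconsistency: the reachability check is applied once, when the route
is inserted, and never again.  This is a toy model only, not the
Solaris kernel; the class, the method names, the interface name
"net0" and the 192.0.2.0/24 documentation addresses are all invented
for illustration.

```python
import errno
import ipaddress


class ToyRouteTable:
    """Toy model of the behaviour described above; not the Solaris kernel."""

    def __init__(self):
        self.interfaces = {}   # name -> [subnet, is_up]
        self.routes = []       # (destination, next_hop) pairs

    def plumb(self, name, subnet, up=True):
        self.interfaces[name] = [ipaddress.ip_network(subnet), up]

    def set_up(self, name, up):
        self.interfaces[name][1] = up

    def add_route(self, destination, next_hop):
        hop = ipaddress.ip_address(next_hop)
        # The reachability check happens once, at insertion time only.
        reachable = any(up and hop in net
                        for net, up in self.interfaces.values())
        if not reachable:
            raise OSError(errno.ENETUNREACH, "next hop not directly reachable")
        self.routes.append((ipaddress.ip_network(destination), hop))


table = ToyRouteTable()
table.plumb("net0", "192.0.2.0/24", up=False)

try:
    table.add_route("0.0.0.0/0", "192.0.2.1")
    early_error = None
except OSError as exc:
    early_error = exc.errno          # interface is down: route refused

table.set_up("net0", True)
table.add_route("0.0.0.0/0", "192.0.2.1")   # accepted now
table.set_up("net0", False)                  # route is not re-validated
still_installed = len(table.routes) == 1
```

The same route is refused in one ordering and silently kept in the
other, which is why sequencing route creation against interface
configuration matters so much for zones.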


[1] Yes, there are still other functional bugs buried here.  Next hop
    addresses are a requirement only for broadcast and NBMA networks,
    not for point-to-point.  Technically, all competent routing
    protocols should (and do) specify an output interface *at all
    times* on routes inserted into the kernel FIB, along with a next
    hop address when necessary for L2 resolution.  Solaris, following
    BSD, has this exactly backwards, and allows a next hop address
    alone to be specified on a route, with the output interface being
    an "optional" parameter that's merely implied by a forwarding
    lookup if absent.  Yes, "route add default 1.2.3.4" really is
    "wrong" (incompatible with IP forwarding design) even though it's
    familiar.
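
    As a rough illustration of the difference, here is a sketch in
    which a route may carry an explicit output interface, a next hop,
    or both.  The field names, the interface names "net0" and "ppp0",
    and the addresses are all hypothetical; this models the idea, not
    any real kernel FIB structure.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    destination: ipaddress.IPv4Network
    out_interface: Optional[str] = None            # hypothetical field names
    next_hop: Optional[ipaddress.IPv4Address] = None


# Hypothetical directly connected subnet on a broadcast interface.
CONNECTED = {"net0": ipaddress.ip_network("192.0.2.0/24")}


def resolve_out_interface(route):
    """Return the output interface, inferring it only when it was omitted."""
    if route.out_interface is not None:
        return route.out_interface      # explicit: no lookup needed
    # BSD-style fallback: imply the interface from a lookup on the next hop.
    for name, subnet in CONNECTED.items():
        if route.next_hop is not None and route.next_hop in subnet:
            return name
    return None                          # nothing matches: unreachable


default = ipaddress.ip_network("0.0.0.0/0")
explicit = Route(default, out_interface="ppp0")   # point-to-point, no next hop
implied = Route(default, next_hop=ipaddress.ip_address("192.0.2.1"))
orphan = Route(default, next_hop=ipaddress.ip_address("198.51.100.1"))
```

    The "explicit" route resolves without consulting any other state,
    while "implied" and "orphan" depend on whatever interfaces happen
    to be configured at lookup time.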

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677
