There's an odd corner case in both the Nevada and Clearview IPMP bits that
I'd like to get some input on.  In general, we do not support a group
consisting only of "standby" interfaces -- at least one interface must be
designated as "active" by the admin, via either ifconfig(1M) or the
/etc/hostname.* configuration files.

However, it's possible that we may be unable to plumb a given interface at
boot (e.g., because FMA has disabled it).  The boot logic automatically
handles bringing up the unplumbable interface's IP data addresses on the
IPMP interface.  But if that interface was the "active" half of an
active/standby group, we have a problem, since the group now effectively
consists only of standby interfaces (and will thus be unreachable).  In
general, in.mpathd cannot detect and "correct" this case because it may
merely be transient -- e.g., it may be that the "active" half of the
group has not yet been plumbed (perhaps because an administrator is
plumbing it interactively, which may take some time).

For Nevada, this is handled by having the boot scripts always clear the
standby flag on whatever IP interface ends up hosting the unplumbable
interface's IP addresses.  Thus, the IPMP group becomes usable, though the
administrator may be confused to see the standby flag (rather than the
inactive flag) cleared on the remaining interface.  With Nevada, this was
the best we could do because the inactive flag was owned by the kernel.
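
To make the Nevada behavior concrete, here is a toy model of it in Python.
This is only a sketch of the logic described above, not the actual boot
script code: the Interface class, the helper names, and the rule that a
group is usable only when some plumbed interface is not standby are all
simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    # Hypothetical stand-in for one IP interface in an IPMP group.
    name: str
    standby: bool = False
    plumbed: bool = True
    addrs: list = field(default_factory=list)


def group_usable(group):
    """Simplified rule: some plumbed interface must not be standby."""
    return any(i.plumbed and not i.standby for i in group)


def boot_failover(group, failed_name):
    """Model of the Nevada workaround for an unplumbable interface.

    Moves the failed interface's addresses to a plumbed peer and
    clears the standby flag on that peer so the group stays usable.
    """
    failed = next(i for i in group if i.name == failed_name)
    host = next(i for i in group if i.plumbed)
    host.addrs.extend(failed.addrs)
    failed.addrs.clear()
    host.standby = False  # the confusing-but-effective part
    return host
```

With a two-interface group where the "active" half cannot be plumbed,
boot_failover() leaves the remaining interface hosting the addresses with
its standby flag cleared, which is exactly the state that may surprise the
administrator.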

With Clearview IPMP, userland controls the inactive flag.  Thus, we could
handle the above case by clearing the inactive flag, but it'd need to be
done inside in.mpathd since it "owns" that flag and will restore it to
whatever value it believes is correct, effectively undoing another
application's clear.  Thus, to solve the problem, we'd need to introduce a
new MI_ACTIVATE messaging primitive, along with a mechanism for the boot
scripts to send this message to in.mpathd as appropriate.  Alternatively,
we could have in.mpathd receive a notification (e.g., sysevent) that
boot-up network configuration is complete and it can handle the standby
case internally.  Or we could go with the same skanky approach that Nevada
currently uses.
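
The ownership problem with the inactive flag can be sketched the same way.
The class below is a hypothetical stand-in for in.mpathd, not its real
structure: it only shows why a clear done by another application gets
undone, and how a message along the lines of the proposed MI_ACTIVATE would
avoid that by changing the daemon's own view.  All names here are
assumptions.

```python
class ToyMpathd:
    """Toy daemon that "owns" the inactive flag of standby interfaces."""

    def __init__(self, standby_ifs):
        # The daemon's belief: every standby interface should be inactive.
        self.want_inactive = {name: True for name in standby_ifs}
        # The flag values as currently set on the interfaces.
        self.flags = dict(self.want_inactive)

    def reconcile(self):
        """Reassert the daemon's belief, overwriting outside changes."""
        self.flags = dict(self.want_inactive)

    def activate(self, name):
        """Hypothetical MI_ACTIVATE handler: change the belief itself."""
        self.want_inactive[name] = False
        self.reconcile()
```

If a boot script flips flags["if1"] directly, the next reconcile() restores
it; going through activate("if1") makes the change stick because the daemon
itself now believes the interface should be active.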

I'm leaning towards the skanky Nevada approach, given that this *is* a
corner case and the other solutions feel a bit over-engineered and would
create a fair bit of code that will probably rot.  Thoughts?


-- 
meem
