Hi Peter,
Peter Memishian wrote:
> (I've added a third question, below -- resending in the hopes that we can
> cover them all in one go. Thanks again.)
>
> While working on the IPMP Rearchitecture[1] code, I've hit upon a few
> issues that need your input. Specifically:
>
> 1. Does Sun Cluster make use of either the IPMP query API
> (PSARC/2002/615) or IPMP Asynchronous Events (PSARC/2002/137)?
> (I know there is a contract, but I'm curious about shipping
> code.) I ask because I've had to introduce some (compatible)
> changes to these interfaces, but there are some incompatible
> changes that I'd also like to make to simplify the code.
> However, if Sun Cluster is using one or both of them, I'll
> hold off on those incompatible changes.
>
We are not using either of those in the shipping product.
> 2. I've had to rework the boot code such that in.mpathd starts
> earlier in boot (as you may recall, currently it is delayed
> via the SUNW_NO_MPATHD environment variable). If I'm reading
> PSARC/2005/142 correctly, I believe the Sun Cluster IPMP SMF
> service depends on the SUNW_NO_MPATHD behavior. As such, I
> think we need to revisit the way the Sun Cluster IPMP SMF
> service works, or introduce a full-fledged Solaris IPMP SMF
> service to subsume the Sun Cluster one. (The latter has been
> planned for some time, but I was hoping to avoid dragging it
> into this already-massive project.) Thoughts?
>
Yes, network/multipath wouldn't like it if mpathd were already started.
Keep in mind that SC only cares that mpathd is monitored and restarted
if it crashes. If you move mpathd startup earlier but make it
restartable, we could even remove the SC SMF service altogether; it was
meant as an interim solution until an mpathd restarter is implemented
in Solaris.
I understand that IPMPng is getting massive, but making mpathd
restartable seems fundamental enough that it belongs there.
> 3. Per PSARC/2002/763, Sun Cluster has a contract to use
> SIOCSLIFGROUPNAME, SIOCGLIFGROUPNAME, lifr_groupname,
> IFF_NOFAILOVER, IFF_STANDBY, IFF_FAILED, IFF_OFFLINE, and
> IFF_INACTIVE. The semantics of several of these will be
> changing. So, as above, does Sun Cluster have shipping code
> that relies on the semantics of these interfaces? If so,
> could you elaborate on the specific dependencies?
>
SIOCGLIFGROUPNAME
Sun Cluster tracks IPMP group states using internal data structures
built by mapping individual interfaces to their IPMP groups.
SIOCSLIFGROUPNAME
In some cases, SC automatically creates singleton IPMP groups. It uses
this ioctl, plus IFF_NOFAILOVER on the (only) interface, to do so.
IFF_FAILED
If all interfaces in an IPMP group have this flag set, SC declares that
the IPMP group has failed and migrates all addresses to a backup node.
IFF_RUNNING
If this flag is cleared, SC considers the interface failed due to a
link-layer fault. The failover strategy is the same as for IFF_FAILED,
except that some internal timeouts are shortened.
IFF_INACTIVE, IFF_STANDBY, IFF_OFFLINE
SC has its own status reporting tools for IPMP groups and member
interfaces. These flags are for reporting only.
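For illustration, the group-state logic described above could be sketched
roughly as follows. This is only a minimal sketch, not Sun Cluster code: the
function names, the interface and group names, and the flag bit values are
all made up for the example (the real flag values come from <net/if.h>), and
treating a cleared IFF_RUNNING the same as IFF_FAILED when deciding whether a
whole group has failed is an assumption on top of what is stated above.

```python
from collections import defaultdict

# Placeholder bit values for illustration only; the real definitions
# live in <net/if.h> on Solaris and are NOT reproduced here.
IFF_RUNNING = 0x1
IFF_FAILED = 0x2


def build_group_map(interfaces):
    """Map each IPMP group name to {ifname: flags} for its members.

    `interfaces` is an iterable of (ifname, groupname, flags) tuples,
    standing in for what a per-interface SIOCGLIFGROUPNAME query plus a
    flags query would return (hypothetical input format).
    """
    groups = defaultdict(dict)
    for ifname, groupname, flags in interfaces:
        if groupname:  # interfaces that are not in any group are skipped
            groups[groupname][ifname] = flags
    return dict(groups)


def interface_failed(flags):
    """Failed if IFF_FAILED is set, or IFF_RUNNING is clear (link fault)."""
    return bool(flags & IFF_FAILED) or not (flags & IFF_RUNNING)


def group_failed(members):
    """A group has failed only when every member interface has failed."""
    return bool(members) and all(interface_failed(f) for f in members.values())


if __name__ == "__main__":
    # Hypothetical interface and group names.
    sample = [
        ("bge0", "sc_ipmp0", IFF_RUNNING),
        ("bge1", "sc_ipmp0", IFF_RUNNING | IFF_FAILED),
        ("bge2", "sc_ipmp1", 0),  # IFF_RUNNING clear: link-layer fault
    ]
    for name, members in build_group_map(sample).items():
        print(name, "failed" if group_failed(members) else "ok")
```

The shortened timeouts for the IFF_RUNNING case are left out of the sketch,
since their values are internal to SC.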
Hope this helps. Let me know if you need more info.
Thanks.
Honsing
> [1] The high-level design document hasn't yet been updated to cover all of
> these issues. However, for general background, please see:
> http://opensolaris.org/os/community/networking/ipmp-highlevel-design.pdf
>
>