Thanks for the information, Honsing.  My responses follow.

 > >        1. Does Sun Cluster make use of either the IPMP query API
 > >           (PSARC/2002/615) or IPMP Asynchronous Events (PSARC/2002/137)?
 > >   
 > We are not using any of those in the shipping product.

Great, that makes life a bit easier.

 > >        2. I've had to rework the boot code such that in.mpathd starts
 > >           earlier in boot (as you may recall, currently it is delayed
 > >           via the SUNW_NO_MPATHD environment variable).  If I'm reading
 > >           PSARC/2005/142 correctly, I believe the Sun Cluster IPMP SMF
 > >           service depends on the SUNW_NO_MPATHD behavior.  As such, I
 > >           think we need to revisit the way the Sun Cluster IPMP SMF
 > >           service works, or introduce a full-fledged Solaris IPMP SMF
 > >           service to subsume the Sun Cluster one.  (The latter has been
 > >           planned for some time, but I was hoping to avoid dragging it
 > >           into this already-massive project.)  Thoughts?
 >
 > Yes, network/multipath wouldn't like it when mpathd is already started.
 > Keep in mind that SC only cares about mpathd being monitored and 
 > restarted if it crashes. If you move mpathd startup earlier, but make it 
 > restartable, then we could even remove the SC SMF service altogether, 
 > which was meant as an interim solution before mpathd restarter is 
 > implemented in Solaris.
 > 
 > I understand the IPMPng is getting massive, but then making it 
 > restartable seems fundamental enough that it does belong there.

I agree that we ultimately want to remove the SC SMF restarter.  One
problem here is that ifconfig automatically starts in.mpathd if it's not
running, which is an awkward fit for the SMF model.  So I'd like to be
sure there are no other viable options before venturing off into this
problem space.  One obvious possibility that occurs to me is to start
network/multipath earlier (e.g., between network/loopback and
network/physical) -- is that impossible for some reason?

 > SIOCGLIFGROUPNAME
 >     Sun Cluster tracks IPMP group states using its internal data
 > structures, built by mapping individual interfaces to their IPMP groups.

That usage should be OK, though it will need to use LIFC_UNDER_IPMP (and
the SO_RTSIPMP socket option if it's using routing sockets) to be able to
discover the interfaces under IPMP (see the design document for details).
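
To make that concrete, here's a rough sketch (mine, not SC code; error
handling omitted) of walking the interfaces with LIFC_UNDER_IPMP and
mapping each one to its group via SIOCGLIFGROUPNAME; the routing-socket
side (SO_RTSIPMP) isn't shown:

/*
 * Rough sketch: list interfaces (including those placed under IPMP,
 * via LIFC_UNDER_IPMP) and print each one's IPMP group name.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sockio.h>
#include <net/if.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	struct lifnum	lifn;
	struct lifconf	lifc;
	struct lifreq	*lifrp;
	int		i, n, s = socket(AF_INET, SOCK_DGRAM, 0);

	(void) memset(&lifn, 0, sizeof (lifn));
	lifn.lifn_family = AF_INET;
	lifn.lifn_flags = LIFC_UNDER_IPMP;	/* include IPMP underlying ifs */
	(void) ioctl(s, SIOCGLIFNUM, &lifn);

	(void) memset(&lifc, 0, sizeof (lifc));
	lifc.lifc_family = AF_INET;
	lifc.lifc_flags = LIFC_UNDER_IPMP;
	lifc.lifc_len = lifn.lifn_count * sizeof (struct lifreq);
	lifc.lifc_buf = malloc(lifc.lifc_len);
	(void) ioctl(s, SIOCGLIFCONF, &lifc);

	n = lifc.lifc_len / sizeof (struct lifreq);
	for (lifrp = lifc.lifc_req, i = 0; i < n; i++, lifrp++) {
		struct lifreq lifr;

		(void) memset(&lifr, 0, sizeof (lifr));
		(void) strlcpy(lifr.lifr_name, lifrp->lifr_name,
		    sizeof (lifr.lifr_name));
		if (ioctl(s, SIOCGLIFGROUPNAME, &lifr) == 0 &&
		    lifr.lifr_groupname[0] != '\0') {
			(void) printf("%s -> group %s\n", lifr.lifr_name,
			    lifr.lifr_groupname);
		}
	}
	free(lifc.lifc_buf);
	(void) close(s);
	return (0);
}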

 > SIOCSLIFGROUPNAME
 >     In some cases, SC automatically creates singleton IPMP groups. This
 > ioctl, plus IFF_NOFAILOVER on the (only) interface, is used.

This one will be problematic -- the call will now fail because there won't
be an IPMP interface for the group yet (ifconfig creates those prior to
calling SIOCSLIFGROUPNAME now).  We'll either need to provide a library
routine for Sun Cluster to call, or Sun Cluster will need to use "ifconfig
<if> group <group> -failover" instead.
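
For reference, here's a minimal sketch of the legacy sequence as I
understand SC's usage (group name via SIOCSLIFGROUPNAME, then
IFF_NOFAILOVER on the interface); the first ioctl is the one that will
now fail when the group has no IPMP interface yet:

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sockio.h>
#include <net/if.h>
#include <string.h>
#include <unistd.h>

/*
 * Legacy singleton-group setup: put `ifname' into `group' and set
 * IFF_NOFAILOVER on it.  Under the new model the SIOCSLIFGROUPNAME
 * call fails if no IPMP interface exists for `group' yet.
 */
int
legacy_singleton_group(int s, const char *ifname, const char *group)
{
	struct lifreq lifr;

	(void) memset(&lifr, 0, sizeof (lifr));
	(void) strlcpy(lifr.lifr_name, ifname, sizeof (lifr.lifr_name));
	(void) strlcpy(lifr.lifr_groupname, group,
	    sizeof (lifr.lifr_groupname));
	if (ioctl(s, SIOCSLIFGROUPNAME, &lifr) == -1)
		return (-1);		/* fails under the new model */

	if (ioctl(s, SIOCGLIFFLAGS, &lifr) == -1)
		return (-1);
	lifr.lifr_flags |= IFF_NOFAILOVER;
	return (ioctl(s, SIOCSLIFFLAGS, &lifr));
}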

 > IFF_FAILED
 >     If all interfaces in an IPMP group have this flag set, SC declares
 > the IPMP group failed and migrates all addresses to a backup node.

That usage should be OK -- but it should instead be monitoring the IPMP
group interface in the future (group interfaces will have the IFF_IPMP
flag set).
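
Roughly speaking, once SC knows the group interface's name it would just
need something like the following ("ipmp0" is purely illustrative; error
handling omitted):

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sockio.h>
#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	struct lifreq	lifr;
	int		s = socket(AF_INET, SOCK_DGRAM, 0);

	/* check group health on the IPMP group interface itself */
	(void) memset(&lifr, 0, sizeof (lifr));
	(void) strlcpy(lifr.lifr_name, "ipmp0", sizeof (lifr.lifr_name));
	if (ioctl(s, SIOCGLIFFLAGS, &lifr) == 0 &&
	    (lifr.lifr_flags & IFF_IPMP) && (lifr.lifr_flags & IFF_FAILED))
		(void) printf("%s: group has failed\n", lifr.lifr_name);
	(void) close(s);
	return (0);
}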

 > IFF_RUNNING
 >     If cleared, SC considers the interface failed due to link-layer
 > faults.  The failover strategy is the same as for IFF_FAILED, except
 > that some internal timeouts are shortened.

Same as IFF_FAILED.  (BTW, if IFF_RUNNING is cleared, IFF_FAILED will also
be set, so I assume this is meant to be a modifier on the IFF_FAILED
handling.)

 > IFF_INACTIVE, IFF_STANDBY, IFF_OFFLINE
 >     SC has its own status reporting tools for IPMP groups and member 
 > interfaces. These flags are for reporting only.

Again, should be OK, but LIFC_UNDER_IPMP and SO_RTSIPMP will be needed in
the future.

--
meem
