On Fri, 2008-04-25 at 04:05 -0400, Peter Memishian wrote: > Seb, > > While sanity-testing the merged Cleaview IPMP bits, I stumbled on an > odd problem: in.mpathd repeatedly offlined all but one interface in the > group, stating: > > Apr 25 03:34:59 whitestar1-2.East.Sun.COM in.mpathd[100049]: IP interface > ce0 has a hardware address which is not unique in group a; offlining > > Some digging around revealed the problem: the Nevada DL_NOTE_PHYS_ADDR > mechanism has been expanded to also report changes to point-to-point > destination addresses via MAC_NOTE_DEST. Thus, when in.mpathd enables > DL_NOTIFY_INDs for DL_NOTE_PHYS_ADDR (via dlpi_enabnotify()), the > DL_NOTIFY_REQ leads to DL_NOTIFY_INDs being generated for both > DL_CURR_PHYS_ADDR and DL_CURR_DEST_ADDR (via dld_str_notify_ind()).
I think part of the the bug is that DL_NOTE_PHYS_ADDR/DL_CURR_DEST_ADDR is being generated for a datalink that has no destination. The str_notify() function should probably have some way of skipping this notification in the MAC_NOTE_DEST case. If this notification weren't received on ce0, then in.mpathd wouldn't have tripped over it. This bug can probably be fixed independently of what's discussed below. > Further, the dlpi_notifyinfo_t generated for these two events by libdlpi > cannot be differentiated, as the dl_data information is lost in > translation. Thus, the DL_CURR_DEST_ADDR message looks like a > notification that the hardware address has become zero, leading to > in.mpathd believing that all interfaces have a hardware address of zero. I think it's somewhat wrong that dlpi_notifyinfo_t can't differentiate between different types of DL_NOTE_PHYS_ADDR notifications. This notification already supported DL_IPV6_LINK_LAYER_ADDR and DL_CURR_PHYS_ADDR before I added DL_CURR_DEST_ADDR, but on the other hand, DL_NOTE_PHYS_ADDR already has a lame design, and maybe we shouldn't extend it any further... > Needless to say, this needs some work. Minimally, it seems libdlpi needs > to be enhanced to allow callbacks to tell the difference between > DL_CURR_PHYS_ADDR and DL_CURR_DEST_ADDR. One idea would be to filter out notifications other than those for DL_CURR_PHYS_ADDR for now, as ip is really the only intended consumer of DL_CURR_DEST_ADDR (and of the existing DL_IPV6_LINK_LAYER_ADDR!). But given what you've written below, that might not be necessary... > Going further, it's unclear to > me if there are consumers that care about both DL_CURR_PHYS_ADDR and > DL_CURR_DEST_ADDR. That is, does it make sense to have both of these > show up as DL_NOTE_PHYS_ADDR, or would it be better to have a new > DL_NOTE_DEST_ADDR message type? (With such a change, the libdlpi change > would no longer be mandatory, but would still probably be worthwhile.) My first reaction to this was that it feels wrong to design the DLPI protocol around libdlpi, but thinking about this further, I don't have a problem adding a DL_NOTE_DEST_ADDR given that I'm not at all enamored with the existing DL_NOTE_PHYS_ADDR design. Let's proceed down this path. As for DL_IPV6_LINK_LAYER_ADDR, I think it's an ATM specific thing (I wasn't privy to the design nor implementation of this thing), maybe libdlpi could just drop these on the floor? > > PS. Given that this needs to be resolved before the IPMP bits can be > used, I'm holding off on publishing new archives just yet. Understood. Why don't I implement DL_NOTE_DEST_ADDR and put this back to the clearview gate, and we can proceed from there. -Seb
