Re: [networking-discuss] IP multicast, and other thoughts

James Carlson Tue, 05 Jun 2007 05:28:19 -0700

Garrett D'Amore writes:
> > See ip_rput_v6 for some code for which I profusely apologize (right
> > below the comment about CR 6451644).
> >   
> 
> Yikes!  What a mess.


Indeed.

>  And that code has a non-zero (not non-trivial) 
> performance impact on the hottest IPv6 code path.  :-(

I dare you to measure it.  Seriously, it adds a few trivial checks to
that code path, and the bulk of that code (past the assignment to
ll_multicast) is simply never used -- it's multicast only, not bulk
unicast data.

I agree it's not pretty, but given all the work the stack actually
does, and the invasiveness of the corresponding Nemo fix, I'd like to
see actual numbers before accepting any such assertion.  I strongly
suspect it's a wash.

> The fix you have in that code is also inadequate, since there is no 
> guarantee that the MBLKHEAD(mp) is nonzero.  For example, if a message 
> must be pulled up to address IP header alignment, then it will lose the 
> ethernet header.

It's the best that could be done under the circumstances.  If you look
at the actual usage, you'll see that the failure mode is that we
potentially misidentify an external probe conflict as one of our own
messages and thus ignore it.  That'd be unfortunate, but it's not a
complete disaster.
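
To make that failure mode concrete, here is a rough user-space sketch
of the idea.  It is not the ip_rput_v6 code: struct buf,
probe_is_external() and the constants are names made up for this
illustration, and it assumes a plain 14-byte Ethernet header sitting in
the headroom just before the IP header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN 6
#define ETHER_HDR_LEN  14   /* dst(6) + src(6) + type(2) */

/* Hypothetical stand-in for a message block; not the real mblk_t. */
struct buf {
    const uint8_t *base;  /* start of the underlying data buffer */
    const uint8_t *rptr;  /* first byte of the IP header */
};

/* Bytes available before rptr (the quantity the thread calls MBLKHEAD). */
static size_t headroom(const struct buf *b)
{
    return (size_t)(b->rptr - b->base);
}

/*
 * Returns true if the link-layer source address can be recovered from
 * the headroom and differs from ours.  If the header is gone (for
 * example after a pullup), the message is treated as one of our own,
 * which is the "misidentify and ignore" case described above.
 */
static bool probe_is_external(const struct buf *b,
                              const uint8_t our_mac[ETHER_ADDR_LEN])
{
    if (headroom(b) < ETHER_HDR_LEN)
        return false;

    const uint8_t *src = b->rptr - ETHER_HDR_LEN + ETHER_ADDR_LEN;
    return memcmp(src, our_mac, ETHER_ADDR_LEN) != 0;
}
```

The only lossy case is the early return: a genuinely external probe
whose header was stripped gets ignored rather than flagged.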

Note also that we take the slower path (with message allocation) only
when the message is non-unicast and thus only when the underlying
driver is using Nemo's broken inbound logic.  Because Nemo is not a
public interface, we know the structure of all Nemo-based drivers, and
thus we can predict exactly which (if any) of them would fail in the
way you suggest.
None do, as best I can tell.

> Which brings me to my second question, which is, apart from knowing the 
> unicast/nonunicast state, is the information in the DL_UNITDATA_IND 
> actually useful?   I.e. are there any cases, outside maybe of debugging 
> that would be performed with DTrace or somesuch, where that information 
> is likely to be logged or tracked somewhere?

None that I know of.  Of course, we're just looking at IP here.
DLPIv2 is a standard, and there are certainly other things written
atop it that may not necessarily share IP's mostly agnostic view of
L2.

> I'm seriously thinking more and more that the multicast/group address 
> state should just be passed with the M_DATA packet, rather than having 
> to go to the same kinds of pains that you've already done.  Especially 
> since the GLDv3 layer has to do this _already_ (so that it can properly 
> check for matches in its own multicast address filters!)

That'd probably be handy (and much like BSD).  I don't think it was
done originally because (at a guess) the original fast-path designers
didn't see much of a reason to optimize the rarely-used cases.  In
other words, fast-path focuses on unicast data, and particularly on
TCP, which is the hot path through the stack, and represents the great
bulk of the traffic we care about.  Having a tiny fraction of the
traffic go at "regular speed" rather than "fast" is a fair trade-off
against having the complexity of multiple mechanisms to do the same
thing (e.g., supporting both flags on M_DATA and DL_UNITDATA_IND, at
the driver's discretion).
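
For the sake of discussion, the "flags carried with the packet" idea
could look something like the sketch below.  The flag names and
classify_dst() are invented here and do not correspond to any existing
GLDv3 or DLPI definitions; the only fact relied on is that an Ethernet
group address has the low-order bit of its first octet set.

```c
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN  6

/* Hypothetical per-packet flags; not part of any real interface. */
#define PKT_F_MULTICAST 0x1u
#define PKT_F_BROADCAST 0x2u

/*
 * Classify a destination MAC address once, at the link layer, so that
 * upper layers can read the result instead of re-deriving it.
 * Returns 0 for unicast.
 */
static unsigned classify_dst(const uint8_t dst[ETHER_ADDR_LEN])
{
    static const uint8_t bcast[ETHER_ADDR_LEN] =
        { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    if (memcmp(dst, bcast, ETHER_ADDR_LEN) == 0)
        return PKT_F_BROADCAST;
    if (dst[0] & 0x01)          /* group bit set */
        return PKT_F_MULTICAST;
    return 0;
}
```

Since the link layer already inspects the destination to match its own
multicast filters, recording the answer this way would cost little.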

Note that one of the things I'm relying on here is that
DL_UNITDATA_IND is purely an inbound (off the wire) phenomenon.  You
don't see it on our own transmitted packets, so it provides a way to
distinguish "sent by us to a group address and looped internally" from
"seen on the wire from someone else."
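
Reduced to a toy, the distinction amounts to the following.  The enum
and struct names are made up for illustration and only loosely echo the
STREAMS message types; the sketch assumes the message is already known
to be addressed to a group, and that the driver really does deliver
DL_UNITDATA_IND for inbound non-unicast traffic.

```c
/* Hypothetical message kinds, loosely modeled on M_DATA vs. M_PROTO. */
enum msg_kind { MSG_DATA, MSG_UNITDATA_IND };

enum origin { ORIGIN_LOOPED_BACK, ORIGIN_FROM_WIRE };

/* A group-addressed message as seen by IP (illustrative only). */
struct group_msg {
    enum msg_kind kind;
};

/*
 * An indication header means the message came off the wire; its
 * absence means it is a copy of our own transmission looped back
 * internally.
 */
static enum origin group_msg_origin(const struct group_msg *m)
{
    return (m->kind == MSG_UNITDATA_IND) ? ORIGIN_FROM_WIRE
                                         : ORIGIN_LOOPED_BACK;
}
```

A driver that hands up non-unicast traffic as bare data breaks the
second assumption, which is why the workaround exists at all.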

-- 
James Carlson, Solaris Networking              <[EMAIL PROTECTED]>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677
_______________________________________________
networking-discuss mailing list
networking-discuss@opensolaris.org
