Richard L. Hamilton wrote:
> Or maybe the "proper" solution is for broadcast/multicast software
> to implement proper PMTUD at user-level (trying messages with the
> DF flag set, looking at ipm_nextmtu if in receipt of an
> ICMP_UNREACH_NEEDFRAG message, etc.)?  In which case, the question
> ends up being "is the impact of that change on application software
> acceptable"?

For broadcast and well-known multicast -- where there's intentionally no
expectation of any forwarding because the application protocols involved
are intentionally designed without forwarding -- I don't think it makes
sense to delve into the explicitly per-path MTU discovery mechanism
(where "P" equals "path").  Just from a design perspective, you should
be able to get everything you need to know from the interface data.

More generally, PMTUD with multicast is frightening.  You get a
potentially unbounded set of "updates" with each packet you send and
(without a connection) they're unfiltered.  It's basically a DDoS attack
on yourself.  It doesn't seem possible to get only locally-generated
"path" information with the defined v6 interfaces (and nothing at all
with v4).

So, I'm not sure that's the right answer unless it's explicitly the
right answer.  In other words, if you're writing a multicast application
that relies on multicast forwarding in your network, and you believe you
need to send packets larger than the conservative 576 (v4) / 1280 (v6)
limit, and you want to have the size tuned dynamically to avoid
fragmentation, then I suspect you have no choice but to look at PMTUD.
But if one of those things is untrue, then you have other (and likely
safer) options.

> In summary, if someone is paying for InfiniBand, they probably will
> expect serious performance.  Failing to take advantage of the
> higher MTU possible with unicast IPoIB-CM would shortchange them;
> but misleading multicast/broadcast clients would also be sub-optimal.

The chance that I'll run into multicast applications running with IPoIB
is low enough that I doubt it'll ever be an issue for me personally.  I
was just commenting on the architectural gap.  ;-}

-- 
James Carlson         42.703N 71.076W         <carls...@workingcode.com>
_______________________________________________
opensolaris-arc mailing list
opensolaris-arc@opensolaris.org
