Richard L. Hamilton wrote:
> Or maybe the "proper" solution is for broadcast/multicast software
> to implement proper PMTUD at user-level (trying messages with the
> DF flag set, looking at ipm_nextmtu if in receipt of an
> ICMP_UNREACH_NEEDFRAG message, etc? In which case, the question
> ends up being "is the impact of that change on application software
> acceptable"?

For broadcast and well-known multicast -- where there's intentionally no
expectation of any forwarding because the application protocols involved
are designed without forwarding -- I don't think it makes sense to delve
into the explicitly per-path MTU discovery mechanism (the "P" in PMTUD
stands for "path"). Just from a design perspective, you should be able
to get everything you need to know from the interface data.

More generally, PMTUD with multicast is frightening. You get a
potentially unbounded set of "updates" with each packet you send and
(without a connection) they're unfiltered. It's basically a DDoS attack
on yourself. It doesn't seem possible to get only locally-generated
"path" information with the defined v6 interfaces (and nothing at all
with v4).

So, I'm not sure that's the right answer unless it's explicitly the
right answer. In other words, if you're writing a multicast application
that relies on multicast forwarding in your network, and you believe you
need to send packets larger than the minimum unfragmented 576/1280
limit, and you want to have the size tuned dynamically to avoid
fragmentation, then I suspect you have no choice but to look at PMTUD.
But if any one of those things is untrue, then you have other (and
likely safer) options.

> In summary, if someone is paying for InfiniBand, they probably will
> expect serious performance. Failing to take advantage of the
> higher MTU possible with unicast IPoIB-CM would shortchange them;
> but misleading multicast/broadcast clients would also be sub-optimal.

The chance that I'll run into multicast applications running with IPoIB
is low enough that I doubt it'll ever be an issue for me personally. I
was just commenting on the architectural gap. ;-}

-- 
James Carlson 42.703N 71.076W <carls...@workingcode.com>
_______________________________________________
opensolaris-arc mailing list
opensolaris-arc@opensolaris.org
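
As a rough illustration of the two "safer options" discussed above, the sketch below computes the largest UDP payload that avoids fragmentation in each case: sizing from the local interface MTU when no forwarding is expected, and falling back to the 576/1280 figures when packets may be forwarded but PMTUD is not used. The function names are hypothetical, the interface MTU is assumed to have been read elsewhere (for example with the SIOCGIFMTU ioctl), and the header sizes assume no IPv4 options and no IPv6 extension headers.

```python
# Fixed header sizes in bytes. Assumption: no IPv4 options and no IPv6
# extension headers are present.
IP_HEADER = {4: 20, 6: 40}
UDP_HEADER = 8

# Datagram sizes usable without any discovery: 576 for IPv4 and 1280 for
# IPv6, the "576/1280 limit" mentioned in the message above.
SAFE_DATAGRAM = {4: 576, 6: 1280}


def max_udp_payload(mtu, ip_version):
    """Largest UDP payload that fits in one unfragmented packet of size mtu."""
    if ip_version not in IP_HEADER:
        raise ValueError("ip_version must be 4 or 6")
    payload = mtu - IP_HEADER[ip_version] - UDP_HEADER
    if payload < 0:
        raise ValueError("mtu is too small to carry a UDP datagram")
    return payload


def link_local_payload(interface_mtu, ip_version):
    """Broadcast / non-forwarded multicast: size from the interface MTU only."""
    return max_udp_payload(interface_mtu, ip_version)


def forwarded_payload_without_pmtud(ip_version):
    """Forwarded multicast without PMTUD: stay within the 576/1280 figure."""
    return max_udp_payload(SAFE_DATAGRAM[ip_version], ip_version)


if __name__ == "__main__":
    # 1500 is only an example interface MTU (typical Ethernet).
    print(link_local_payload(1500, 4))
    print(forwarded_payload_without_pmtud(6))
```

Only an application that needs more than the second figure across forwarded paths, and wants the size tuned dynamically, would have to go beyond this and look at PMTUD.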