On Tue, Aug 07, 2007 at 05:09:53PM -0700, Sean Hefty wrote:
> >Well, the MTU isn't explicitly carried in the headers, but if you send
> >a 2K packet into a path that only supports 1K MTU then it will be
> >discarded. In that sense the MTU is included in the headers.
>
> This is what I meant by the MTU is implied by the other fields. I was thinking
> about it this way. If a PR query contains the SLID, DLID, SL, I would expect
> the SA to lookup the MTU for this path and return it. Is there any advantage or
> reason to include the MTU in such a query?

Er, IPoIB does not do that though? It should create a PR with DGID,
SGID, TClass, Pkey, etc. based on the IP L2 information from the ARP/ND
packet. The result of that query is then used to wrap IP datagrams in UD
datagrams. Thus it must ask the SA for a path with a minimum MTU large
enough to carry the largest IP datagram.

> Taking this across subnets, if a PR query contains the SGID, DGID, TC, and FL,
> does this change whether the MTU should be specified? I'm not trying to argue
> against including the MTU, but I don't know if IPoIB or the SA should specify
> it. (And I'm neither an IPoIB nor SA expert.)

Ah, well, MTU is really used as both something you request and
something the SA returns.

In the case of datagram communication the MTU is very important since
datagram fragmentation is impossible. In those cases the end points
must ask for paths that meet their MTU requirements (there may be
switching paths that do not, and without guidance the SA is free to
return anything).

For RC, MTU is something that should not generally be requested, and
the returned value from the SA should be used to configure the
connection. This gives the SA freedom to return paths across multiple
switching paths. This is possible only because the RC message size is
not affected by the connection MTU.

> >The MTU of every unicast path used must be greater than the Linux
> >interface MTU so that the stack produces correctly sized fragments.
>
> As you mentioned, this doesn't hold for IPoIB-CM. The MTU of the path is less
> than the MTU sent by the stack, and the path MTUs could differ, including being
> less than the broadcast MTU. (I don't know that the implementation supports
> different path MTUs, but it could in theory.)

Right, RC is handled differently than UD/UC when talking about MTU.

> Couldn't IPoIB fragment the packets if it needed? (Not sure what that would do
> to the performance.)
> How does IPoIB-CM handle the case where the device MTU is,
> say, 64k, but the remote side only supports UD?

There is no provision for fragment identification and reassembly in
the IPoIB RFC.

AFAIK, when using the 64K MTU setting for IPoIB, if the remote side
doesn't support RC then things go wonky. For TCP things *might* be
saved by path MTU discovery, but PMTU is driven by ICMP errors, which
are not generated by an IB network.

I suspect that if you use IPoIB in a mixed configuration like that you
are going to want to have routing table entries that override the MTU
for non-RC capable destinations. But I haven't tried this.

Jason
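
For illustration only, the UD versus RC distinction above can be modelled
in a few lines of Python. This is a toy sketch, not the SA's or the IPoIB
driver's actual logic: the function names, the list-of-MTUs
representation of candidate paths, and the 4-byte encapsulation overhead
parameter are all assumptions made for the example.

```python
from typing import List, Optional, Sequence


def usable_paths(path_mtus: Sequence[int],
                 min_mtu: Optional[int] = None) -> List[int]:
    """Toy model of which candidate paths an SA may hand back.

    path_mtus: MTU in bytes of each candidate path (hypothetical input).
    min_mtu:   minimum MTU the requester insists on, or None when the
               requester leaves the choice to the SA (the RC case).
    """
    if min_mtu is None:
        # No guidance given: every path is a legal answer.
        return list(path_mtus)
    # A datagram sender cannot fragment, so smaller paths are excluded.
    return [mtu for mtu in path_mtus if mtu >= min_mtu]


def ud_datagram_fits(ip_mtu: int, path_mtu: int,
                     encap_bytes: int = 4) -> bool:
    """True if an IP datagram of ip_mtu bytes plus an assumed
    encapsulation header of encap_bytes fits in one UD packet."""
    return ip_mtu + encap_bytes <= path_mtu


if __name__ == "__main__":
    candidates = [1024, 2048]
    print(usable_paths(candidates, min_mtu=2048))  # UD-style request
    print(usable_paths(candidates))                # RC-style request
    print(ud_datagram_fits(2044, 1024))            # too big for a 1K path
```

In the RC-style call the requester would then configure the connection
with whatever MTU comes back, rather than constraining the query.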
