I feel like we are talking about different things here:
The ***IP*** MTU is relevant for IPoIB performance because it determines the number of times that you are going to be hit by the per-packet overhead of the ***host*** networking stack. My point was that the ***IP MTU*** will not be tied to the ***IB*** MTU if a connected mode IPoIB (or SDP) is used instead of the current IPoIB that uses IB UD transport service. The IB MTU would then be irrelevant to this discussion.
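To put rough numbers on the per-packet cost, here is a minimal Python sketch that counts how many IP packets the host stack has to process for one transfer. The function name, the 20-byte IPv4 header (no options) and the example MTU values are assumptions chosen for illustration, not measurements from any IPoIB implementation.

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options


def ip_packets_needed(total_bytes: int, ip_mtu: int) -> int:
    """Number of IP packets needed to carry total_bytes of payload."""
    payload_per_packet = ip_mtu - IPV4_HEADER_BYTES
    if payload_per_packet <= 0:
        raise ValueError("IP MTU must be larger than the IP header")
    # Integer ceiling division: a partly filled last packet still counts.
    return -(-total_bytes // payload_per_packet)


# Hypothetical comparison: a 2048-byte IP MTU versus a 64 KiB IP MTU.
for mtu in (2048, 65536):
    print(f"IP MTU {mtu}: {ip_packets_needed(1_000_000, mtu)} packets")
```

Under these assumptions a 1,000,000-byte transfer costs 494 trips through the host stack at a 2048-byte IP MTU and 16 at 65536, whatever IB MTU sits underneath.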
As for the eventual 2G ***IP*** MTU limit, it still sounds more than reasonable to me. I wouldn't mind if a 10TB file gets split into some IP packets up to 2GB?!?!? each.
Keep in mind that IP has a limit on its datagram size (normal and jumbo datagrams) which is far below 2GB. IP datagrams are datagrams. Large messages are expected to use segmentation and reassembly (SAR) across a set of datagrams to ensure forward progress with minimal impact to overall performance in the event of a transmission error.
With the exception of the UD transport service, where IB messages are limited to a single packet, the choice of ***IB*** MTU and its impact on performance is a completely unrelated issue. IB messages are split into packets and reassembled by the HCA HW, so the per-IB-message overhead of the SW stack is independent of the IB MTU. The choice of IB MTU may indeed affect performance for other reasons, but it is not immediately obvious that the largest available IB MTU is the best option for all cases. For example, latency optimization of small high priority packets under load may benefit from smaller IB MTUs (e.g. 256).
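As a small illustration of that independence, the sketch below counts the wire packets the HCA would produce for one IB message at different IB MTUs. The helper name and the 64 KiB message size are hypothetical, and packet headers are ignored.

```python
def ib_wire_packets(message_bytes: int, ib_mtu: int) -> int:
    """Wire packets for one IB message of message_bytes (payload only)."""
    if message_bytes <= 0 or ib_mtu <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division: the last packet may be only partly filled.
    return -(-message_bytes // ib_mtu)


message = 64 * 1024  # hypothetical: one 64 KiB IB message posted by software
for ib_mtu in (256, 2048, 4096):
    packets = ib_wire_packets(message, ib_mtu)
    print(f"IB MTU {ib_mtu}: 1 message seen by SW, {packets} packets on the wire")
```

The software-visible count stays at one message in every case; only the number of packets handled in hardware changes.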
This is best handled by VL arbitration. Changing the IB MTU to 256 for a UD based implementation would violate the IP minimum datagram size requirement.
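For a sense of scale on the latency point, this sketch computes how long one maximum-size packet occupies the link, which is the longest a small packet would wait behind a packet already being transmitted. It assumes a packet in flight is not preempted, ignores headers, and uses 8 Gbit/s (the data rate of a 4X SDR link) as an assumed link speed; the function name is hypothetical.

```python
def serialization_delay_us(packet_bytes: int, link_bits_per_sec: float) -> float:
    """Time in microseconds to put packet_bytes on the wire."""
    return packet_bytes * 8 / link_bits_per_sec * 1e6


LINK_RATE = 8e9  # assumption: 4X SDR data rate in bits per second
for ib_mtu in (256, 2048, 4096):
    delay = serialization_delay_us(ib_mtu, LINK_RATE)
    print(f"IB MTU {ib_mtu}: {delay:.3f} us per full-size packet")
```

Under these assumptions the difference between a 256-byte and a 2048-byte IB MTU is under 2 microseconds per hop, so whether it matters depends on the latency budget of the traffic in question.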
Mike
Diego
_______________________________________________
- -----Original Message-----
- From: Stephen Poole [ mailto:[EMAIL PROTECTED]]
- Sent: Thursday, January 06, 2005 5:45 AM
- To: Diego Crupnicoff
- Cc: '[email protected]'
- Subject: RE: [openib-general] ip over ib throughtput
- Have you done any "load" analysis of a 2K vs. 4K MTU? Your analogy of having 2G as a total message size is potentially flawed. You seem to assume that 2G is the end-all in size; it is not. What about when you want to (down the road) use IB for files 1-10TB in size? Granted, we can live with 2G, but it is not some nirvana number. Second, the 2G limit on message sizes only determines the upper bound on overall size; I could send 2G at a 32-byte MTU. So the question is: how much less system load/impact would a 4K MTU have than a 2K MTU? Remember, even Ethernet finally decided to go to Jumbo Frames. Why? System impact and more. Remember HIPPI/GSN: the MTU was 64K. Reason: system impact. The numbers I have seen running IPoIB really impact the system.
- Steve...
- At 10:38 AM -0800 1/5/05, Diego Crupnicoff wrote:
- Note however that the relevant IB limit is the max ***message size*** which happens to be equal to the ***IB*** MTU for the current IPoIB (that runs on top of IB UD transport service where IB messages are limited to a single packet).
- A connected mode IPoIB (that runs on top of IB RC/UC transport service) would allow IB messages up to 2GB long. That will allow for much larger (effectively as large as you may ever dream of) ***IP*** MTUs, regardless of the underlying IB MTU.
- Diego
- > -----Original Message-----
- > From: Hal Rosenstock [mailto:[EMAIL PROTECTED]]
- > Sent: Wednesday, January 05, 2005 2:21 PM
- > To: Peter Buckingham
- > Cc: [email protected]
- > Subject: Re: [openib-general] ip over ib throughtput
- >
- >
- > On Wed, 2005-01-05 at 12:23, Peter Buckingham wrote:
- > > stupid question: why are we limited to a 2K MTU for IPoIB?
- >
- > The IB max MTU is 4K. The current HCAs support a max MTU of 2K.
- >
- > -- Hal
- >
- > _______________________________________________
- > openib-general mailing list
- > [email protected]
- > http://openib.org/mailman/listinfo/openib-general
- >
- > To unsubscribe, please visit
- > http://openib.org/mailman/listinfo/openib-general
- >
--- Steve Poole ([EMAIL PROTECTED])
Office: 505.665.9662 - Los Alamos National Laboratory
Cell: 505.699.3807
- CCN - Special Projects / Advanced Development
Fax: 505.665.7793
- P.O. Box 1663, MS B255
- Los Alamos, NM. 87545
- 03149801S
