On Tue, Apr 16, 2013 at 9:30 AM, Roland Dreier <[email protected]> wrote:
> From: Roland Dreier <[email protected]>
>
> Markus Stockhausen <[email protected]> noticed that IPoIB was
> spending significant time doing memcpy() in __pskb_pull_tail(). He
> found that this is because his adapter reports a maximum MTU of 4K,
> which causes IPoIB datagram mode to receive all the actual data in a
> separate page in the fragment list.
>
> We're already allocating extra tailroom for the skb linear part, so we
> might as well use it. In fact, we might as well allocate a big enough
> linear part so that all the data fits there in the relatively common
> case of a 2K IB MTU, and only use a fragment page for 4K IB MTU.
>
> Cc: Eric Dumazet <[email protected]>
> Reported-by: Markus Stockhausen <[email protected]>
> Signed-off-by: Roland Dreier <[email protected]>
> ---
> v4: Leave enough space in linear part of skb so that all data ends up
> there with 2K IB MTU. Still not sure how this affects perf with a
> 4K IB MTU (should be better since we avoid pulling IP headers out of
> first fragment).
>

I am afraid I don't understand what the issue is.

The pull_tail() in itself is not a performance issue: the Intel guys
only fixed, a few days ago, the fact that the IGB/IXGBE drivers were not
pulling TCP headers into skb->head, and nobody noticed.

The real cost is the cache line miss.

Now, if you pull too many bytes into skb->head, say part of the TCP
payload, you lose opportunities for TCP coalescing or splice().
