On 11/24/2015 07:49 AM, Eric Dumazet wrote:
> But in the end, latencies were bigger, because the application had to copy from kernel to user (read()) the full message in one go. Whereas if you wake up the application for every incoming GRO message, we prefill cpu caches, and the last read() only has to copy the remaining part and benefits from hot caches (up-to-date RFS state, TCP socket structure, but also data in the application).
You can see something similar (at least in terms of latency) when messing about with MTU sizes. For some message sizes - 8KB being a popular one - you will see higher latency on the likes of netperf TCP_RR with JumboFrames than you would with the standard 1500-byte MTU. At least that is something I saw on GbE links years back; I chalked it up to getting better parallelism between the NIC and the host.
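For reference, the netperf invocation for that kind of comparison is along the lines of `netperf -H <host> -t TCP_RR -- -r 8192,8192` (8KB request and response). The same request/response pattern can be sketched as a minimal loopback probe; this is a hedged illustration of what a TCP_RR transaction measures, not a substitute for netperf, and the message size and iteration count are arbitrary choices:

```python
import socket
import statistics
import threading
import time

MSG_SIZE = 8192  # mirrors the 8KB case discussed above; any size works
ITERS = 200      # arbitrary transaction count for this sketch

def echo_server(listener):
    """Accept one connection and echo fixed-size messages back."""
    conn, _ = listener.accept()
    with conn:
        while True:
            buf = b""
            while len(buf) < MSG_SIZE:
                chunk = conn.recv(MSG_SIZE - len(buf))
                if not chunk:
                    return
                buf += chunk
            conn.sendall(buf)

listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # ephemeral port on loopback
listener.listen(1)
threading.Thread(target=echo_server, args=(listener,), daemon=True).start()

cli = socket.socket()
cli.connect(listener.getsockname())
# Disable Nagle so each request goes out immediately, as netperf does for RR
cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

payload = b"x" * MSG_SIZE
rtts = []
for _ in range(ITERS):
    t0 = time.perf_counter()
    cli.sendall(payload)
    got = b""
    while len(got) < MSG_SIZE:
        got += cli.recv(MSG_SIZE - len(got))
    rtts.append(time.perf_counter() - t0)
cli.close()

print(f"median RTT for {MSG_SIZE}-byte transactions: "
      f"{statistics.median(rtts) * 1e6:.1f} us")
```

Over loopback this obviously says nothing about MTU effects; on a real link, running it (or the netperf equivalent) with the MTU set to 1500 versus 9000 is what exposes the latency difference described above.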
Of course the service demands were lower with JumboFrames...

rick jones