From: Soheil Hassas Yeganeh <[email protected]>
Date: Mon, 14 Sep 2020 17:52:10 -0400

> From: Soheil Hassas Yeganeh <[email protected]>
> 
> For EPOLLET, applications must call sendmsg until they get EAGAIN.
> Otherwise, there is no guarantee that EPOLLOUT is sent if there was
> a failure upon memory allocation.
> 
> As a result, on high-speed NICs, userspace observes multiple small
> sendmsgs after a partial sendmsg until EAGAIN, since TCP can send
> 1-2 TSOs in between two sendmsg syscalls:
> 
> // One large partial send due to memory allocation failure.
> sendmsg(20MB)   = 2MB
> // Many small sends until EAGAIN.
> sendmsg(18MB)   = 64KB
> sendmsg(17.9MB) = 128KB
> sendmsg(17.8MB) = 64KB
> ...
> sendmsg(...)    = EAGAIN
> // At this point, userspace can assume an EPOLLOUT.
> 
> To fix this, set SOCK_NOSPACE in all partial sendmsg scenarios
> to guarantee that we send EPOLLOUT after a partial sendmsg.
> 
> After this commit, userspace can assume that it will receive an EPOLLOUT
> after the first partial sendmsg. This EPOLLOUT will benefit from
> sk_stream_write_space() logic delaying the EPOLLOUT until significant
> space is available in the write queue.
> 
> Signed-off-by: Eric Dumazet <[email protected]>
> Signed-off-by: Soheil Hassas Yeganeh <[email protected]>

Applied.
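
The userspace behaviour described in the quoted commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the patch: send_edge_triggered(), enum send_state and the stop_on_partial flag are hypothetical names, and send() with MSG_DONTWAIT stands in for a non-blocking sendmsg(). With stop_on_partial set to false the loop keeps sending until EAGAIN, which is what EPOLLET required before this change. Setting it to true stops at the first partial send, which is presumably only safe for TCP sockets on kernels that include this commit.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result type for this sketch. */
enum send_state {
    SEND_DONE,          /* the whole buffer was queued */
    SEND_WAIT_EPOLLOUT, /* stop sending and wait for EPOLLOUT */
    SEND_ERROR          /* unrecoverable error, see errno */
};

/*
 * Hypothetical helper: queue up to len bytes on fd without blocking.
 *
 * stop_on_partial == false: loop until EAGAIN (pre-commit rule for EPOLLET).
 * stop_on_partial == true:  return on the first partial send, assuming the
 *                           kernel guarantees a later EPOLLOUT.
 *
 * *sent receives the number of bytes queued so far.
 */
static enum send_state send_edge_triggered(int fd, const char *buf, size_t len,
                                           bool stop_on_partial, size_t *sent)
{
    *sent = 0;
    while (*sent < len) {
        size_t want = len - *sent;
        ssize_t n = send(fd, buf + *sent, want, MSG_DONTWAIT | MSG_NOSIGNAL);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return SEND_WAIT_EPOLLOUT;
            return SEND_ERROR;
        }

        *sent += (size_t)n;

        /* Partial send: the socket ran out of space or memory. */
        if ((size_t)n < want && stop_on_partial)
            return SEND_WAIT_EPOLLOUT;
    }
    return SEND_DONE;
}
```

In the trace from the commit message, the stop_on_partial mode would return right after the 2MB partial send instead of issuing the series of 64KB-128KB sends, and the caller would resume from the unsent offset when epoll reports EPOLLOUT.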
