From: Soheil Hassas Yeganeh <[email protected]>
Date: Mon, 14 Sep 2020 17:52:10 -0400

> From: Soheil Hassas Yeganeh <[email protected]>
>
> For EPOLLET, applications must call sendmsg until they get EAGAIN.
> Otherwise, there is no guarantee that EPOLLOUT is sent if there was
> a failure upon memory allocation.
>
> As a result on high-speed NICs, userspace observes multiple small
> sendmsgs after a partial sendmsg until EAGAIN, since TCP can send
> 1-2 TSOs in between two sendmsg syscalls:
>
>   // One large partial send due to memory allocation failure.
>   sendmsg(20MB) = 2MB
>   // Many small sends until EAGAIN.
>   sendmsg(18MB) = 64KB
>   sendmsg(17.9MB) = 128KB
>   sendmsg(17.8MB) = 64KB
>   ...
>   sendmsg(...) = EAGAIN
>   // At this point, userspace can assume an EPOLLOUT.
>
> To fix this, set the SOCK_NOSPACE on all partial sendmsg scenarios
> to guarantee that we send EPOLLOUT after partial sendmsg.
>
> After this commit userspace can assume that it will receive an EPOLLOUT
> after the first partial sendmsg. This EPOLLOUT will benefit from
> sk_stream_write_space() logic delaying the EPOLLOUT until significant
> space is available in write queue.
>
> Signed-off-by: Eric Dumazet <[email protected]>
> Signed-off-by: Soheil Hassas Yeganeh <[email protected]>

Applied.
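For readers following along from userspace, here is a minimal sketch of the two send strategies the commit message contrasts. It is not part of the patch. The names send_et(), enum send_state and the stop_on_partial flag are hypothetical, and the socket is assumed to be non-blocking and registered with EPOLLOUT | EPOLLET. Returning after the first partial sendmsg is only safe on a TCP socket running a kernel that carries this change; without it, the caller has to keep going until EAGAIN.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical result type for one send attempt. */
enum send_state { SEND_DONE, SEND_WAIT_EPOLLOUT, SEND_ERROR };

/*
 * Send buf[*off, len) on a non-blocking socket and advance *off by the
 * number of bytes accepted by the kernel.
 *
 * stop_on_partial == 0: keep calling sendmsg() until EAGAIN, which is
 *                       what EPOLLET required before this change.
 * stop_on_partial != 0: return after the first partial sendmsg() and
 *                       rely on a later EPOLLOUT. ASSUMPTION: the kernel
 *                       sets SOCK_NOSPACE on partial sends, as the
 *                       commit message describes.
 */
static enum send_state send_et(int fd, const char *buf, size_t len,
                               size_t *off, int stop_on_partial)
{
    assert(off != NULL && *off <= len);

    while (*off < len) {
        struct iovec iov = {
            .iov_base = (void *)(buf + *off),
            .iov_len = len - *off,
        };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
        ssize_t n = sendmsg(fd, &msg, MSG_NOSIGNAL);

        if (n < 0) {
            if (errno == EINTR)
                continue;                   /* interrupted, retry */
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return SEND_WAIT_EPOLLOUT;  /* no space, wait for epoll */
            return SEND_ERROR;
        }

        *off += (size_t)n;

        /* Partial send: optionally stop here instead of probing for EAGAIN. */
        if (stop_on_partial && *off < len)
            return SEND_WAIT_EPOLLOUT;
    }
    return SEND_DONE;
}
```

A caller would invoke send_et() again with the same off each time epoll_wait() reports EPOLLOUT for the socket. With stop_on_partial set, that avoids the run of small sendmsg calls shown in the trace above.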
