On Mon, Feb 27, 2017 at 12:16 PM, Jacob Champion <[email protected]> wrote:
>
> On 02/23/2017 04:48 PM, Yann Ylavic wrote:
>> On Wed, Feb 22, 2017 at 8:55 PM, Daniel Lescohier wrote:
>>>
>>> IOW: read():Three copies: copy from filesystem cache to httpd
>>> read() buffer to encrypted-data buffer to kernel socket buffer.
>>
>> Not really, "copy from filesystem cache to httpd read() buffer" is
>> likely mapping to userspace, so no copy (on read) here.
>
> Oh, cool. Which kernels do this? It seems like the VM tricks would have to
> be incredibly intricate for this to work; reads typically don't happen in
> page-sized chunks, nor to aligned addresses. Linux in particular has
> comments in the source explaining that they *don't* do it for other syscalls
> (e.g. vmsplice)... but I don't have much experience with non-Linux systems.

I don't understand this claim. If read() returned an API-provisioned buffer, it could point wherever it liked, including at a 4k page. As things stand, the void* (or char*) buffer passed to read() is at an arbitrary offset, and no common OS I'm familiar with maps a page to a non-page-aligned address. The kernel socket send[v]() call might avoid a copy in the direct-send case, depending on the implementation.
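
To make the alignment point concrete, here is a minimal sketch, assuming a POSIX system with mmap() and MAP_ANONYMOUS. The helper names (is_page_aligned, page_offset, map_one_page) are made up for this example and are not httpd or APR functions. It only shows that an address handed out by the kernel via mmap() sits on a page boundary, while a caller-chosen address, such as the buffer passed to read(), generally does not; it says nothing about what any particular kernel does internally.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: byte offset of p within its containing page. */
static size_t page_offset(const void *p)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    return (size_t)((uintptr_t)p % (uintptr_t)page);
}

/* Hypothetical helper: nonzero if p lies exactly on a page boundary. */
static int is_page_aligned(const void *p)
{
    return page_offset(p) == 0;
}

/* Hypothetical helper: ask the kernel for one anonymous page.
 * The kernel picks the address, so it is page aligned.
 * Returns NULL on failure. */
static void *map_one_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Calling is_page_aligned() on the result of map_one_page() returns nonzero, while calling it on that pointer plus 7 bytes (standing in for a read() buffer at an arbitrary offset) returns zero, and page_offset() reports 7.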
