On Fri, 6 Mar 2020 at 07:27, Aleksei Ivanov <iv.aleks...@gmail.com> wrote:
>
> > What do you mean "just one syscall"? The entire point here is that it'd
> > take more syscalls to send the same amount of data.
>
> I mean that it messages are large enough more than 2K we will need 4 syscalls
> without copy it to the internal buffer, but currently we will copy 8K of
> messages and send it using 1 call. I think that under some threshold of
> packet length it is redundant to copy it to internal buffer and the data can
> be sent directly.

I think what you're suggesting is more complex than you may expect. PostgreSQL is single threaded and relies pretty heavily on the ability to buffer internally. It also expects its network I/O to always succeed. Just switching to directly doing nonblocking I/O is not very feasible. Changing the network I/O paths may expose a lot more opportunities for send vs receive deadlocks. It also complicates the protocol's handling of message boundaries, since failures and interruptions can occur at more points.

Have you measured anything that suggests that our admittedly inefficient multiple handling of send buffers is performance-significant compared to the vast amount of memory allocation and copying we do all over the place elsewhere? Do you have a concrete reason to want to remove this?

If I had to change this model I'd probably be looking at an iovector-style approach, like we use with shm_mq. Assemble an array of buffer descriptors pointing to short, usually statically allocated buffers and populate one with each pqformat step. Then when the message is assembled, use writev(2) or similar to dispatch it. Maybe do some automatic early flushing if the buffer space overflows. But that might need a protocol extension so we had a way to recover after interrupted sending of a partial message...

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
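
To make the iovector-style idea above a bit more concrete, here is a rough, untested-against-the-backend sketch of what I have in mind. All the names (MsgVec, msgvec_init, msgvec_add, msgvec_send, MSG_MAX_PARTS) are made up for illustration and are not existing PostgreSQL APIs. It assumes a blocking socket and deliberately ignores the hard parts: nonblocking I/O, early flushing, and recovering from a partially sent message.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical limit on descriptors per message; not a PostgreSQL constant. */
#define MSG_MAX_PARTS 16

/* Hypothetical descriptor array: points at caller-owned buffers, copies nothing. */
typedef struct MsgVec
{
	struct iovec parts[MSG_MAX_PARTS];
	int			nparts;
} MsgVec;

void
msgvec_init(MsgVec *mv)
{
	assert(mv != NULL);
	mv->nparts = 0;
}

/*
 * Record one piece of the message. The data must stay valid until
 * msgvec_send() returns. Returns 0 on success, -1 if the array is full
 * (where a real implementation might flush early instead).
 */
int
msgvec_add(MsgVec *mv, const void *data, size_t len)
{
	if (mv->nparts >= MSG_MAX_PARTS)
		return -1;
	mv->parts[mv->nparts].iov_base = (void *) data;
	mv->parts[mv->nparts].iov_len = len;
	mv->nparts++;
	return 0;
}

/*
 * Dispatch the assembled message with writev(2), retrying on EINTR and
 * on short writes. Returns total bytes written, or -1 on error (in which
 * case part of the message may already be on the wire).
 */
ssize_t
msgvec_send(int fd, MsgVec *mv)
{
	struct iovec *iov = mv->parts;
	int			cnt = mv->nparts;
	ssize_t		total = 0;

	while (cnt > 0)
	{
		ssize_t		n = writev(fd, iov, cnt);

		if (n < 0)
		{
			if (errno == EINTR)
				continue;
			return -1;
		}
		total += n;

		/* Skip descriptors that were written completely. */
		while (cnt > 0 && (size_t) n >= iov->iov_len)
		{
			n -= iov->iov_len;
			iov++;
			cnt--;
		}

		/* Trim the descriptor that was only partially written, if any. */
		if (cnt > 0)
		{
			iov->iov_base = (char *) iov->iov_base + n;
			iov->iov_len -= n;
		}
	}

	mv->nparts = 0;
	return total;
}
```

A caller would add one descriptor per formatting step (message type byte, length word, payload) and then call msgvec_send() once. Note the -1 return after a partial write is exactly the case where the stream is left mid-message, which is why I suspect some protocol-level recovery would be needed.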