I find the backend libpq changes related to non-blocking I/O quite complex. Can we find a simpler solution?
The problem we're trying to solve is that while the walsender backend is sending a lot of WAL records to the client, the client can be sending a lot of messages to the backend. If the volume of messages from client to server exceeds both the input buffer in the server and the output buffer in the client, the client will block until the server has read some data. But if the client is blocked, it will not process incoming data from the server, and eventually the server will block too. Then we have a deadlock. This is a pretty good description of the problem:

http://florin.bjdean.id.au/docs/omnimark/omni55/docs/html/concept/717.htm

The first question is: do we really need to be prepared for that? The XLogRecPtr acknowledgment messages the client sends are very small, and if the client is mindful about not sending them too often (say, at most one ack per received XLOG message), the receive buffer in the backend should never fill up in practice.

If that's deemed not good enough, we could modify just internal_flush() so that it uses secure_poll to wait until the socket is either readable or writable, instead of blocking only for write. Whenever there's incoming data, read it into PqRecvBuffer for later processing, which keeps the OS input buffer from filling up. If PqRecvBuffer fills up, it can be extended, or we can start dropping old XLogRecPtr messages from it.

In any case, we'll need something like pq_wait to check whether a message can be read without blocking, but that's just a small additional function, as opposed to a whole new API for assembling and sending messages without blocking.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
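
For illustration, the internal_flush() idea described above could be sketched roughly as follows. This is a standalone sketch under stated assumptions, not PostgreSQL code: it uses plain POSIX poll() where the post mentions secure_poll, and RecvStash, flush_draining_input() and make_test_pair() are hypothetical names, with RecvStash standing in for PqRecvBuffer. It neither extends the buffer nor drops old acks when the stash fills up; it simply stops reading.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for PqRecvBuffer: holds input that arrives
 * while we are busy sending. */
typedef struct {
    char   data[8192];
    size_t len;
} RecvStash;

/*
 * Send all of buf on a non-blocking socket. Instead of waiting only for
 * writability, wait for "readable or writable" and stash any incoming
 * bytes, so the peer's sends cannot back up while we are sending.
 * Returns 0 on success, -1 on error or if the peer closed the socket.
 */
static int
flush_draining_input(int sock, const char *buf, size_t len, RecvStash *stash)
{
    size_t sent = 0;

    assert(buf != NULL && stash != NULL);

    while (sent < len) {
        struct pollfd pfd = { .fd = sock, .events = POLLOUT };

        /* Assumption: once the stash is full we just stop reading. */
        if (stash->len < sizeof stash->data)
            pfd.events |= POLLIN;

        if (poll(&pfd, 1, -1) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        if (pfd.revents & POLLIN) {
            ssize_t n = read(sock, stash->data + stash->len,
                             sizeof stash->data - stash->len);
            if (n > 0)
                stash->len += (size_t) n;
            else if (n == 0)
                return -1;      /* peer closed the connection */
            else if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
                return -1;
        }

        if (pfd.revents & POLLOUT) {
            ssize_t n = write(sock, buf + sent, len - sent);
            if (n > 0)
                sent += (size_t) n;
            else if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK
                     && errno != EINTR)
                return -1;
        }

        /* Error or hangup with nothing left to read or write. */
        if ((pfd.revents & (POLLERR | POLLHUP | POLLNVAL))
            && !(pfd.revents & (POLLIN | POLLOUT)))
            return -1;
    }
    return 0;
}

/* Hypothetical helper for trying the function out: a connected socket
 * pair whose first end is non-blocking. Returns 0 on success. */
static int
make_test_pair(int sv[2])
{
    int flags;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return 0;
}
```

If an ack is already queued on the socket when the flush starts, it ends up in the stash and the outgoing data is still delivered in full, which is the behaviour that breaks the deadlock cycle.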