Re: libpq: Process buffered SSL read bytes to support records >8kB on async API

Andres Freund Wed, 16 Jul 2025 14:34:49 -0700

Hi,

On 2025-07-16 11:50:46 -0700, Jacob Champion wrote:
> On Wed, Jul 16, 2025 at 11:11 AM Andres Freund <and...@anarazel.de> wrote:
> > If one modifies libpq to use openssl readahead (which does result in 
> > speedups,
> > because otherwise openssl think it's useful to do lots of 5 byte reads from
> > the socket), I see occasional hangs in libpq.
>
> Now that is a very interesting coincidence, because the patch I'm
> working on is about to add an assertion that readahead is turned
> *off*. :D


Heh.


> Based on my understanding of [1], readahead makes this overall problem
> much worse by opportunistically slurping bytes off the wire and doing
> absolutely nothing with them until you call SSL_read() enough times to
> finally get to them.

Right - but it also substantially reduces the number of syscalls :(. I'm not
sure that it's wise to close that door permanently. Clearly it's not something
we need to care about in the back branches.

Without it openssl will do one read for the record length, and another read
for the record contents. I.e. it'll often double the number of syscalls.


Without readahead:

perf stat -e raw_syscalls:sys_enter,cycles,context-switches 
src/bin/pgbench/pgbench -c $c -j$j -M prepared -n -P 1 -f <(echo 'select 1') -h 
localhost -t1000

             5,561      raw_syscalls:sys_enter
        89,125,976      cycles
             1,009      context-switches


With readahead:

             4,556      raw_syscalls:sys_enter
        83,582,571      cycles
             1,007      context-switches



> It would even hide those bytes from the pending
> count:
>
> > Therefore, it is possible for no more bytes to be readable from the
> > underlying BIO (because OpenSSL has already read them) and for
> > SSL_pending() to return 0, even though readable application data bytes
> > are available (because the data is in unprocessed buffered records).

It's rather depressing, like how this this API behaviour came to be?


Luckily there seems to be SSL_has_pending():

> SSL_has_pending() returns 1 if s has buffered data (whether processed or
> unprocessed) and 0 otherwise

And it seems old enough:

> The SSL_has_pending() function was added in OpenSSL 1.1.0.


> > It does seem ... weird ... that pqGetCopyData3() just processes whatever is
> > already in the libpq buffer and then waits for the socket to become 
> > readable,
> > before ever calling pqReadData(). What guarantees, even without the change
> > above, that a) there's no pending data inside openssl b) that we're not
> > waiting for a write?
>
> (a) is the bug discussed here, AFAICT?

I'm not entirely sure, tbh.


> > If I make pqGetCopyData3() call pqReadData() before the pqWait() and 
> > continue
> > if it returns 1, the hang vanishes.
>
> Sure, but that's cheating. I think I'd expect that adding extra
> "unnecessary" reads would help cover up the problem that readahead has
> caused.

To me the pattern in our code seems bogus, even without readahead, no?

getCopyDataMessage() can return 0 because our internal buffer is empty,
without ever reading from the socket or openssl. But if the SSL record
contained two messages (or parts of two messages), all the required data may
be in openssl. In which case the pqWait() that pqGetCopyData3() does will wait
forever.  That's a problem unrelated to readahead, no?

I.e. the extra read actually ensures that we're not doing a pqWait() without
knowing whether we need one.

Greetings,

Andres Freund

Re: libpq: Process buffered SSL read bytes to support records >8kB on async API

Reply via email to