Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

Tom Lane Sat, 11 Jan 2020 07:29:48 -0800

I wrote:
> Here's a draft patch that cleans up all the logic errors I could find.


So last night I was assuming that this problem just requires more careful
attention to what to return in the error exit paths.  In the light of
morning, though, I realize that the algorithms involved in
be-secure-gssapi.c and fe-secure-gssapi.c are just fundamentally wrong:

* On the read side, the code will keep looping until it gets a no-data
error from the underlying socket call.  This is silly.  In every or
almost every use, the caller's read length request corresponds to the
size of a buffer that's meant to be larger than typical messages, so
that betting that we're going to fill that buffer completely is the
wrong way to bet.  Meanwhile, it's fairly likely that the incoming
encrypted packet's length *does* correspond to some actual message
boundary; that would only not happen if the sender is forced to break
up a message, which ought to be a minority situation, else our buffer
size choices are too small.  So it's very likely that the looping just
results in doubling the number of read() calls that are made, with
half of them failing with EWOULDBLOCK.  What we should do instead is
return to the caller whenever we finish handing back the decrypted
contents of a packet.  We can do the read() on the next call, after
the caller's dealt with that data.

* On the write side, if the code encrypts some data and then gets
EWOULDBLOCK trying to write it, it will tell the caller that it
successfully wrote that data.  If that was all the data the caller
had to write (again, not so unlikely) this is a catastrophic
mistake, because the caller will be satisfied and will go to sleep,
rather than calling again to attempt another write.  What we *must*
do is to reflect the write failure verbatim whether or not we
encrypted some data.  We must remember how much data we encrypted
and then discount that much of the caller's supplied data next time.
There are hints in the existing comments that somebody understood
this at one point, but the code isn't acting that way today.

I expect that I can prove point B by hot-wiring pqsecure_raw_write
to randomly return EWOULDBLOCK (instead of making any write attempt)
every so often.  I think strace will be enough to confirm point A,
but haven't tried it yet.

                        regards, tom lane

Re: 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?)

Reply via email to