I wrote: > Here's a draft patch that cleans up all the logic errors I could find.
So last night I was assuming that this problem just requires more careful attention to what to return in the error exit paths. In the light of morning, though, I realize that the algorithms involved in be-secure-gssapi.c and fe-secure-gssapi.c are just fundamentally wrong: * On the read side, the code will keep looping until it gets a no-data error from the underlying socket call. This is silly. In every or almost every use, the caller's read length request corresponds to the size of a buffer that's meant to be larger than typical messages, so that betting that we're going to fill that buffer completely is the wrong way to bet. Meanwhile, it's fairly likely that the incoming encrypted packet's length *does* correspond to some actual message boundary; that would only not happen if the sender is forced to break up a message, which ought to be a minority situation, else our buffer size choices are too small. So it's very likely that the looping just results in doubling the number of read() calls that are made, with half of them failing with EWOULDBLOCK. What we should do instead is return to the caller whenever we finish handing back the decrypted contents of a packet. We can do the read() on the next call, after the caller's dealt with that data. * On the write side, if the code encrypts some data and then gets EWOULDBLOCK trying to write it, it will tell the caller that it successfully wrote that data. If that was all the data the caller had to write (again, not so unlikely) this is a catastrophic mistake, because the caller will be satisfied and will go to sleep, rather than calling again to attempt another write. What we *must* do is to reflect the write failure verbatim whether or not we encrypted some data. We must remember how much data we encrypted and then discount that much of the caller's supplied data next time. There are hints in the existing comments that somebody understood this at one point, but the code isn't acting that way today. I expect that I can prove point B by hot-wiring pqsecure_raw_write to randomly return EWOULDBLOCK (instead of making any write attempt) every so often. I think strace will be enough to confirm point A, but haven't tried it yet. regards, tom lane