Robert Haas wrote:
> On Mon, Sep 23, 2013 at 4:51 PM, Alvaro Herrera
> <alvhe...@2ndquadrant.com> wrote:
> > Here's an updated version; this mainly simplifies code, per comments
> > from Andres (things were a bit too baroque in places due to the way the
> > code had evolved, and I hadn't gone over it to simplify it).
> >
> > The only behavior change is that the renegotiation is requested 1kB
> > before the limit is hit: the raise to 1% of the configured limit was
> > removed.
>
> What basis do we have for thinking that 1kB is definitely enough to
> avoid spurious disconnects?

I noticed that the "count" variable (which is what we use to determine
when to start the renegotiation and eventually kill the connection) is
only incremented when there's successful SSL transmission: it doesn't
count low-level network transmission.  If OpenSSL returns a WANT_READ or
WANT_WRITE error code, that variable is not incremented.  The number of
bytes returned does not include network data transmitted only to satisfy
the renegotiation.

Sadly, with the OpenSSL codebase, there isn't much documented field
experience to go by.  Even something battle-tested such as Apache's
mod_ssl gets this wrong; but apparently they don't care because their
sessions are normally so short-lived that they don't get these problems.

Also, I spent several days trying to understand the OpenSSL codebase to
figure out how this works, and I think there might be bugs in there too,
at least with nonblocking sockets.  I wasn't able to reproduce an actual
failure, though.  Funnily enough, their own test utilities do not stress
this area too much (at least the ones they include in their release
tarballs).

> (I have a bad feeling that you're going to say something along the
> lines of "well, we tried it a bunch of times, and...".)

Well, I did try a few times and saw no failure :-)

I have heard about processes in production environments that are
restarted periodically to avoid SSL failures which they blame on
renegotiation.  Some other guys have ssl_renegotiation_limit=0 because
they know it causes network problems.  I suggest we need to get this
patch out there, so that they can test it; and if 1kB turns out not to
be sufficient, we will have field experience including appropriate error
messages on what is actually going on.  (Right now, the error messages
we get are complaining about completely the wrong thing.)

I mean, if that 1kB limit is the only quarrel you have with this patch,
I'm happy.
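
To make the accounting described above concrete, here is a minimal, self-contained sketch of the idea. It is not the code in the patch: the names (reneg_state, account_io, the IO_* and ACT_* values) are hypothetical stand-ins, and the status enum only mimics OpenSSL's WANT_READ/WANT_WRITE results rather than using the real constants. It also assumes that the connection is given up on once the counter reaches the limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the outcome of an SSL read/write call. */
typedef enum { IO_OK, IO_WANT_READ, IO_WANT_WRITE } io_status;

/* What the caller should do after accounting for one I/O attempt. */
typedef enum { ACT_NONE, ACT_REQUEST_RENEG, ACT_LIMIT_HIT } reneg_action;

/* Request renegotiation this many bytes before the limit (1kB). */
#define RENEG_MARGIN 1024L

typedef struct
{
    long limit;            /* renegotiation limit in bytes; 0 disables it */
    long count;            /* application bytes successfully transmitted */
    int  reneg_requested;  /* nonzero once a renegotiation has been requested */
} reneg_state;

/*
 * Account for one I/O attempt.  Only successful transfers advance the
 * counter; WANT_READ/WANT_WRITE results (for instance, network traffic
 * exchanged only to carry out the renegotiation) are not counted.
 */
static reneg_action
account_io(reneg_state *st, io_status status, long nbytes)
{
    assert(st != NULL);

    if (st->limit <= 0)
        return ACT_NONE;        /* renegotiation disabled */

    if (status != IO_OK || nbytes <= 0)
        return ACT_NONE;        /* nothing transmitted at the SSL level */

    st->count += nbytes;

    if (st->count >= st->limit)
        return ACT_LIMIT_HIT;   /* assumption: caller drops the connection */

    if (!st->reneg_requested && st->count >= st->limit - RENEG_MARGIN)
    {
        st->reneg_requested = 1;
        return ACT_REQUEST_RENEG;
    }

    return ACT_NONE;
}
```

With a limit of 4096 bytes, the sketch asks for renegotiation once 3072 bytes have gone through, and any WANT_READ or WANT_WRITE results seen while the handshake is in progress leave the counter where it was.  So the 1kB margin is measured purely in application bytes, which is the point the question about whether it is "enough" hinges on.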
-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers