On Wed, 22 Feb 2023 at 07:27, Heikki Linnakangas <hlinn...@iki.fi> wrote: > > On 20/01/2023 03:28, Jacob Champion wrote: > > On Wed, Jan 18, 2023 at 7:16 PM Greg Stark <st...@mit.edu> wrote: > >> * "Service Mesh" type tools that hide multiple services behind a > >> single host/port ("Service Mesh" is just a new buzzword for "proxy"). > > > > If you want to multiplex protocols on a port, now is an excellent time > > to require clients to use ALPN on implicit-TLS connections. (There are > > no clients that can currently connect via implicit TLS, so you'll > > never have another chance to force the issue without breaking > > backwards compatibility.) That should hopefully make it harder to > > ALPACA yourself or others [2]. > > Good idea. Do we want to just require the protocol to be "postgres", or > perhaps "postgres/3.0"? Need to register that with IANA, I guess.
I had never heard of this before, it does seem useful. But if I understand it right it's entirely independent of this patch. We can add it to all our Client/Server exchanges whether they're the initial direct SSL connection or the STARTTLS negotiation? > We implemented a protocol version negotiation mechanism in the libpq > protocol itself, how would this interact with it? If it's just > "postgres", then I guess we'd still negotiate the protocol version and > list of extensions after the TLS handshake. > > >> Incidentally I find the logic in ProcessStartupPacket incredibly > >> confusing. It took me a while before I realized it's using tail > >> recursion to implement the startup logic. I think it would be way more > >> straightforward and extensible if it used a much more common iterative > >> style. I think it would make it possible to keep more state than just > >> ssl_done and gss_done without changing the function signature every > >> time for example. > > > > +1. The complexity of the startup logic, both client- and server-side, > > is a big reason why I want implicit TLS in the first place. That way, > > bugs in that code can't be exploited before the TLS handshake > > completes. > > +1. We need to support explicit TLS for a long time, so we can't > simplify by just removing it. But let's refactor the code somehow, to > make it more clear. > > Looking at the patch, I think it accepts an SSLRequest packet even if > implicit TLS has already been established. That's surely wrong, and > shows how confusing the code is. (Or I'm reading it incorrectly, which > also shows how confusing it is :-) ) I'll double check it but I think I tested that that wasn't the case. I think it accepts the SSL request packet and sends back an N which the client libpq just interprets as the server not supporting SSL and does an unencrypted connection (which is tunneled over stunnel unbeknownst to libpq). I agree I would want to flatten this logic to an iterative approach but having wrapped my head around it now I'm not necessarily rushing to do it now. The main advantage of flattening it would be to make it easy to support other protocol types which I think could be really interesting. It would be much clearer to document the state machine if all the state is in one place and the code just loops through processing startup packets and going to a new state until the connection is established. That's true now but you have to understand how the state is passed in the function parameters and notice that all the recursion is tail recursive (I think). And extending that state would require extending the function signature which would get awkward quickly. > Regarding Vladimir's comments on how clients can migrate to this, I > don't have any great suggestions. To summarize, there are several options: > > - Add an "fast_tls" option that the user can enable if they know the > server supports it > > - First connect in old-fashioned way, and remember the server version. > Later, if you reconnect to the same server, use implicit TLS if the > server version was high enough. This would be most useful for connection > pools. Vladimir pointed out that this doesn't necessarily work. The server may be new enough to support it but it could be behind a proxy like pgbouncer or something. The same would be true if the server reported a "connection option" instead of depending on version. > - Connect both ways at the same time, and continue with the fastest, > i.e. "happy eyeballs" That seems way too complex for us to bother with imho. > - Try implicit TLS first, and fall back to explicit TLS if it fails. > For libpq, we don't necessarily need to do anything right now. We can > add the implicit TLS support in a later version. Not having libpq > support makes it hard to test the server codepath, though. Maybe just > test it with 'stunnel' or 'openssl s_client'. I think we should have an option to explicitly enable it in psql, if only for testing. And then wait five years and switch the default on it then. In the meantime users can just set it based on their setup. That's not the way to the quickest adoption but imho the main advantages of this option are the options it gives users, not the latency improvement, so I'm not actually super concerned about adoption rate. I assume we'll keep the negotiated mode indefinitely because it can handle any other protocols we might want. For instance, it currently handles GSSAPI -- which raises the question, are we happy with GSSAPI having this extra round trip? Is there a similar change we could make for it? My understanding is that GSSAPI is an abstract interface and the actual protocol it's invoking could be anything so we can't make any assumptions about what the first packet looks like. Perhaps we can do something about pipelining GSSAPI messages so if the negotiation fails the server just closes the connection but if it accepts it it does a similar trick with unreading the buffered data and processing it through the GSSAPI calls. -- greg