Hi,

On 2024-02-11 13:19:00 -0500, David Benjamin wrote:
> I've attached a patch for the master branch to fix up the custom BIOs used
> by PostgreSQL, in light of the issues with the OpenSSL update recently.
> While c82207a548db47623a2bfa2447babdaa630302b9 (switching from BIO_get_data
> to BIO_get_app_data) resolved the immediate conflict, I don't think it
> addressed the root cause, which is that PostgreSQL was mixing up two BIO
> implementations, each of which expected to be driving the BIO.

Yea, that's certainly not nice - and I think we've been somewhat lucky it
hasn't caused more issues. There's some nasty possibilities, e.g. sock_ctrl()
partially enabling ktls without our initialization having called
ktls_enable(). Right now that just means ktls is unusable, but it's not hard
to imagine accidentally ending up sending unencrypted data.



I've in the past looked into not using a custom BIO [1], but I have my doubts
about that being a good idea. I think medium term we want to be able to do
network IO asynchronously, which seems quite hard to do when using openssl's
socket BIO.



> Once we've done that, we're free to use BIO_set_data. While BIO_set_app_data
> works fine, I've reverted back to BIO_set_data because it's more commonly 
> used.
> app_data depends on OpenSSL's "ex_data" mechanism, which is a tad heavier 
> under
> the hood.

At first I was a bit wary of that, because it requires us to bring back the
fallback implementation. But you're right, it's noticeably heavier than
BIO_get_data(), and we do call BIO_get_app_data() fairly frequently.



> That leaves ctrl. ctrl is a bunch of operations (it's ioctl). The only
> operation that matters is BIO_CTRL_FLUSH, which is implemented as a no-op. All
> other operations are unused. It's once again good that they're unused because
> otherwise OpenSSL might mess with postgres's socket, break the assumptions
> around interrupt handling, etc.

How did you determine that only FLUSH is required? I didn't even really find
documentation about what the intended semantics actually are.

E.g. should we implement BIO_CTRL_EOF? Sure, it wasn't really supported so
far, because we never set it, but is that right? What about
BIO_CTRL_GET_CLOSE/BIO_CTRL_SET_CLOSE?


Another issue is that 0 doesn't actually seem like the universal error return
- e.g. BIO_C_GET_FD seems to return -1, because 0 is a valid fd.


As of your patch the bio doesn't actually have an FD anymore, should it still
set BIO_TYPE_DESCRIPTOR?



> +static long
> +my_sock_ctrl(BIO *h, int cmd, long num, void *ptr)
> +{
> +     long            res = 0;
> +
> +     switch (cmd)
> +     {
> +             case BIO_CTRL_FLUSH:
> +                     /* libssl expects all BIOs to support BIO_flush. */
> +                     res = 1;
> +                     break;
> +     }
> +
> +     return res;
> +}

I'd move the res = 0 into a default: block. That way the compiler can warn if
some case doesn't set it in all branches.


>  static BIO_METHOD *
>  my_BIO_s_socket(void)
>  {

Wonder if we should rename this. It's pretty odd that we still call it's not
really related to s_socket anymore, and doesn't even really implement the same
interface (e.g. get_fd doesn't work anymore).  Similarly, my_SSL_set_fd()
doesn't actually call set_fd() anymore, which sure seems odd.


Greetings,

Andres Freund

[1] 
https://www.postgresql.org/message-id/20210715021747.h2hih7mw56ivyntt%40alap3.anarazel.de


Reply via email to