Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted

Jan Urbański Thu, 12 Feb 2015 01:23:11 -0800

Andres Freund writes:

> On 2015-02-09 18:17:14 +0100, Jan Urbański wrote:
>> First of all, the current behaviour is crazy. We're setting and unsetting the
>> locking callback every time a connection is made/closed, which is not how
>> OpenSSL is supposed to be used. The *application* using libpq should set a
>> callback before it starts threads, it's no business of the library's.
>
> I don't buy this argument at all. Delivering a libpq that's not
> threadsafe in some circumstances is a really bad idea.


I knew this would be a hard sell :( What I know is that the current situation
is not good and leaving it as-is causes real grief.

>> The old behaviour was slightly less insane (set callbacks first time we're
>> engaging OpenSSL code, never touch them again). The real sane solution is to
>> leave it up to the application.
>
> The real solution would be for openssl to do it itself.

I think that the OpenSSL API is the real culprit there, requiring threads to
register a callback is what's causing all the issues, but I don't think this
will get ever changed.

>> Note that the PHP pgsql extension doesn't set the OpenSSL callbacks, because
>> libpq was setting them on its own. If we remove the callback handling from
>> libpq, PHP will need to add them. By the way, the MySQL extension for PHP 
>> also
>> does not set those callbacks.
>
> I think this shows pretty clearly how insane it would be to change this
> in a minor release. Do you really expect people to update libpq and php
> in unison. After a minor release?

Well, I haven't found reports of threaded PHP+MySQL+SSL programs crashing, and
the MySQL extension also doesn't care about the callbacks. I think it's because
it's both because it's a rare combination, or because almost everyone loads the
cURL extension, which *does* set up callbacks. Like I said, it's not a good
situation.

> Note that we *already* provide applications with the ability to set the
> callbacks themselves and prevent us from doing so: PQinitSSL(false).

Ah, I only now saw that with PQinitOpenSSL(true, false) you can work around the
problem, if you set up your own callbacks first. That seems to at least provide
a way to make libpq not do the callbacks dance in released versions. Thank you.

>> Let me reiterate: I now believe the callbacks should be set by the 
>> application,
>> libraries should not touch them, since they don't know what will they be
>> stomping on. If the application is run through a VM like Python or PHP, it's
>> the VM that should make sure the callbacks are set.
>
> I fail to see why it's any better to do it in the VM. That relies on
> either always loading the VM's openssl module, even if you don't
> actually need it because the library you use deals with openssl. It
> prevents other implementations of openssl in the VM.

The callbacks are a thing that's owned by the application, so libraries have no
business in setting them up. The way OpenSSL's API is specified (very
unfortunately IMHO) is that before you use OpenSSL from threads, you need to
set the callbacks. If you don't control your application's startup (like when
you're a script executed through a VM), you assume the VM took care of it at
startup. If you're a library, you assume the user took care of it, as they
should.

Now I know that a lot of apps get this wrong. Python almost does the right
thing, but you're right, it only sets up callbacks if you load the 'ssl'
module. It still feels like setting them up in library code is stomping on
things not owned by the library - like libperl setting up signal handlers,
which caused problems in Postgres. They're resources not owned by the library!

> What I think we should do is the following:
>
> * Emphasize the fact that it's important to use PQinitSSL(false) if you
>   did things yourself.

PQinitOpenSSL(true, false) if anything. The reason that function got introduced
is precisely because PQinitSSL(false) isn't exactly right, see
http://www.postgresql.org/message-id/[email protected]

That doesn't solve the problem of the Python deadlock, where you're not at
leisure to call a C function at the beginning of your module.

> * If there's already callbacks set: Remember that fact and don't
>   overwrite. In the next major version: warn.

So yeah, that was my initial approach - check if callbacks are set, don't do
the dance if they are. It felt like a crutch, though, and racy at that. There's
no atomic way to test-and-set those callbacks. The window for racyness is
small, though.

> * Assert or something if the callback when unregistering isn't the same
>   as it was when registering. That's pretty much guaranteed to cause
>   issues.

So let me try another proposal and see if it doesn't set alarm bells off.

  * When opening a connection, if there's a callback set, don't overwrite it
    (small race window).

  * When closing a connection, if the callback set is not
    pq_lockingcallback/pq_threadidcallback, don't NULL it (small race window)

Asserts in frontend code are impossible, right? There's no way to warn, either.

That would be a ~8 line patch. Does it feel back-patcheable?

Cheers,
Jan


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted

Reply via email to