Re: Support for NSS as a libpq TLS backend

Andres Freund Wed, 26 Jan 2022 15:59:59 -0800

Hi,

On 2022-01-26 21:39:16 +0100, Daniel Gustafsson wrote:
> > What about
> > reconfiguring (note: add --enable-depend) the linux tasks to build against
> > nss, and then run the relevant subset of tests with it?  Most tests don't 
> > use
> > tcp / SSL anyway, so rerunning a small subset of tests should be feasible?
> 
> That's an interesting idea, I think that could work and be reasonably readable
> at the same time (and won't require in-depth knowledge of Cirrus).  As it's 
> the
> same task it does spend more time towards the max runtime per task, but that's
> not a problem for now.  It's worth keeping in mind though if we deem this to 
> be
> a way forward with testing multiple settings.


I think it's a way for a limited number of settings, that each only require a
limited amount of tests... Rerunning all tests etc is a different story.



> > Is it a great idea to have common/nss.h when there's a library header nss.h?
> > Perhaps we should have a pg_ssl_{nss,openssl}.h or such?
> 
> That's a good point, I modelled it after common/openssl.h but I agree it's
> better to differentiate the filenames.  I've renamed it to common/pg_nss.h and
> we should IMO rename common/openssl.h regardless of what happens to this 
> patch.

+1


> > Does this make some things notably more expensive? Presumably it does 
> > remove a
> > bunch of COW opportunities, but likely that's not a huge factor compared to
> > assymetric crypto negotiation...
> 
> Right, the context of setting up crypto across a network connection it's 
> highly
> likely to drown out the costs.

If you start to need to run a helper to decrypt an encrypted private key, and
do all the initialization, I'm not sure sure that holds true anymore... Have
you done any connection speed tests? pgbench -C is helpful for that.


> > Maybe soem of this commentary should migrate to the file header or such?
> 
> Maybe, or perhaps README.ssl?  Not sure where it would be most reasonable to
> keep it such that it's also kept up to date.

Either would work for me.


> >> This introduce
> >> + * differences with the OpenSSL support where some errors are only 
> >> reported
> >> + * at runtime with NSS where they are reported at startup with OpenSSL.
> > 
> > Found this sentence hard to parse somehow.
> > 
> > It seems pretty unfriendly to only have minimal error checking at postmaster
> > startup time. Seems at least the presence and usability of keys should be 
> > done
> > *also* at that time?
> 
> I'll look at adding some setup, and subsequent teardown, of NSS at startup
> during which we could do checking to be more on par with how the OpenSSL
> backend will report errors.

Cool.


> >> +/*
> >> + * raw_subject_common_name
> >> + *
> >> + * Returns the Subject Common Name for the given certificate as a raw char
> >> + * buffer (that is, without any form of escaping for unprintable 
> >> characters or
> >> + * embedded nulls), with the length of the buffer returned in the len 
> >> param.
> >> + * The buffer is allocated in the TopMemoryContext and is given a NULL
> >> + * terminator so that callers are safe to call strlen() on it.
> >> + *
> >> + * This is used instead of CERT_GetCommonName(), which always performs 
> >> quoting
> >> + * and/or escaping. NSS doesn't appear to give us a way to easily 
> >> unescape the
> >> + * result, and we need to store the raw CN into port->peer_cn for 
> >> compatibility
> >> + * with the OpenSSL implementation.
> >> + */
> > 
> > Do we have a testcase for embedded NULLs in common names?
> 
> We don't, neither for OpenSSL or NSS.  AFAICR Jacob spent days trying to get a
> certificate generation to include an embedded NULL byte but in the end gave 
> up.
> We would have to write our own tools for generating certificates to add that
> (which may or may not be a bad idea, but it hasn't been done).

Hah, that's interesting.


> >> +/*
> >> + * pg_SSLShutdownFunc
> >> + *                Callback for NSS shutdown
> >> + *
> >> + * If NSS is terminated from the outside when the connection is still in 
> >> use
> > 
> > What does "NSS is terminated from the outside when the connection" really
> > mean? Does this mean the client initiating something?
> 
> If an extension, or other server-loaded code, interfered with NSS and managed
> to close contexts in order to interfere with connections this would ensure us
> closing it down cleanly.
> 
> That being said, I was now unable to get my old testcase working so I've for
> now removed this callback from the patch until I can work out if we can make
> proper use of it.  AFAICS other mature NSS implementations aren't using it
> (OpenLDAP did in the past but have since removed it, will look at how/why).

I think that'd be elog(FATAL) time if we want to do anything (after changing
state so that no data is sent to client).


> >> +                          /*
> >> +                           * The error cases for PR_Recv are not 
> >> documented, but can be
> >> +                           * reverse engineered from 
> >> _MD_unix_map_default_error() in the
> >> +                           * NSPR code, defined in 
> >> pr/src/md/unix/unix_errors.c.
> >> +                           */
> > 
> > Can we propose a patch to document them? Don't want to get bitten by this
> > suddenly changing...
> 
> I can certainly propose something on their mailinglist, but I unfortunately
> wouldn't get my hopes up too high as NSS and documentation aren't exactly best
> friends (the in-tree docs doesn't cover the API and Mozilla recently removed
> most of the online docs in their neverending developer site reorg).

Kinda makes me question the wisdom of starting to depend on NSS. When openssl
docs are vastly outshining a library's, that library really should start to
ask itself some hard questions.



> In order to find a good split I think we need to figure what to optimize for;
> do we optimize for ease of reverting should that be needed, or along
> functionality borders, or something else?  I don't have good ideas here, but a
> single 7596 insertions(+), 421 deletions(-) commit is clearly not a good idea.

I think the goal should be the ability to incrementally commit.


> Stephen had an idea off-list that we could look at splitting this across the
> server/client boundary, which I think is the only idea I've so far which has
> legs. (The first to go in would come with the common code of course.)

Yea, that's the most obvious one. I suspect client-side has a lower
complexity, because it doesn't need to replace quite as many things?


> The attached v53 incorporates the fixes discussed above, and builds green for
> both OpenSSL and NSS in Cirrus on my Github repo (thanks again for your work 
> on
> those files) so it will be interesting to see the CFBot running them.

Looks like that worked...

Greetings,

Andres Freund

Re: Support for NSS as a libpq TLS backend

Reply via email to