Hi, On 2022-01-26 21:39:16 +0100, Daniel Gustafsson wrote: > > What about > > reconfiguring (note: add --enable-depend) the linux tasks to build against > > nss, and then run the relevant subset of tests with it? Most tests don't > > use > > tcp / SSL anyway, so rerunning a small subset of tests should be feasible? > > That's an interesting idea, I think that could work and be reasonably readable > at the same time (and won't require in-depth knowledge of Cirrus). As it's > the > same task it does spend more time towards the max runtime per task, but that's > not a problem for now. It's worth keeping in mind though if we deem this to > be > a way forward with testing multiple settings.
I think it's a way for a limited number of settings, that each only require a limited amount of tests... Rerunning all tests etc is a different story. > > Is it a great idea to have common/nss.h when there's a library header nss.h? > > Perhaps we should have a pg_ssl_{nss,openssl}.h or such? > > That's a good point, I modelled it after common/openssl.h but I agree it's > better to differentiate the filenames. I've renamed it to common/pg_nss.h and > we should IMO rename common/openssl.h regardless of what happens to this > patch. +1 > > Does this make some things notably more expensive? Presumably it does > > remove a > > bunch of COW opportunities, but likely that's not a huge factor compared to > > assymetric crypto negotiation... > > Right, the context of setting up crypto across a network connection it's > highly > likely to drown out the costs. If you start to need to run a helper to decrypt an encrypted private key, and do all the initialization, I'm not sure sure that holds true anymore... Have you done any connection speed tests? pgbench -C is helpful for that. > > Maybe soem of this commentary should migrate to the file header or such? > > Maybe, or perhaps README.ssl? Not sure where it would be most reasonable to > keep it such that it's also kept up to date. Either would work for me. > >> This introduce > >> + * differences with the OpenSSL support where some errors are only > >> reported > >> + * at runtime with NSS where they are reported at startup with OpenSSL. > > > > Found this sentence hard to parse somehow. > > > > It seems pretty unfriendly to only have minimal error checking at postmaster > > startup time. Seems at least the presence and usability of keys should be > > done > > *also* at that time? > > I'll look at adding some setup, and subsequent teardown, of NSS at startup > during which we could do checking to be more on par with how the OpenSSL > backend will report errors. Cool. > >> +/* > >> + * raw_subject_common_name > >> + * > >> + * Returns the Subject Common Name for the given certificate as a raw char > >> + * buffer (that is, without any form of escaping for unprintable > >> characters or > >> + * embedded nulls), with the length of the buffer returned in the len > >> param. > >> + * The buffer is allocated in the TopMemoryContext and is given a NULL > >> + * terminator so that callers are safe to call strlen() on it. > >> + * > >> + * This is used instead of CERT_GetCommonName(), which always performs > >> quoting > >> + * and/or escaping. NSS doesn't appear to give us a way to easily > >> unescape the > >> + * result, and we need to store the raw CN into port->peer_cn for > >> compatibility > >> + * with the OpenSSL implementation. > >> + */ > > > > Do we have a testcase for embedded NULLs in common names? > > We don't, neither for OpenSSL or NSS. AFAICR Jacob spent days trying to get a > certificate generation to include an embedded NULL byte but in the end gave > up. > We would have to write our own tools for generating certificates to add that > (which may or may not be a bad idea, but it hasn't been done). Hah, that's interesting. > >> +/* > >> + * pg_SSLShutdownFunc > >> + * Callback for NSS shutdown > >> + * > >> + * If NSS is terminated from the outside when the connection is still in > >> use > > > > What does "NSS is terminated from the outside when the connection" really > > mean? Does this mean the client initiating something? > > If an extension, or other server-loaded code, interfered with NSS and managed > to close contexts in order to interfere with connections this would ensure us > closing it down cleanly. > > That being said, I was now unable to get my old testcase working so I've for > now removed this callback from the patch until I can work out if we can make > proper use of it. AFAICS other mature NSS implementations aren't using it > (OpenLDAP did in the past but have since removed it, will look at how/why). I think that'd be elog(FATAL) time if we want to do anything (after changing state so that no data is sent to client). > >> + /* > >> + * The error cases for PR_Recv are not > >> documented, but can be > >> + * reverse engineered from > >> _MD_unix_map_default_error() in the > >> + * NSPR code, defined in > >> pr/src/md/unix/unix_errors.c. > >> + */ > > > > Can we propose a patch to document them? Don't want to get bitten by this > > suddenly changing... > > I can certainly propose something on their mailinglist, but I unfortunately > wouldn't get my hopes up too high as NSS and documentation aren't exactly best > friends (the in-tree docs doesn't cover the API and Mozilla recently removed > most of the online docs in their neverending developer site reorg). Kinda makes me question the wisdom of starting to depend on NSS. When openssl docs are vastly outshining a library's, that library really should start to ask itself some hard questions. > In order to find a good split I think we need to figure what to optimize for; > do we optimize for ease of reverting should that be needed, or along > functionality borders, or something else? I don't have good ideas here, but a > single 7596 insertions(+), 421 deletions(-) commit is clearly not a good idea. I think the goal should be the ability to incrementally commit. > Stephen had an idea off-list that we could look at splitting this across the > server/client boundary, which I think is the only idea I've so far which has > legs. (The first to go in would come with the common code of course.) Yea, that's the most obvious one. I suspect client-side has a lower complexity, because it doesn't need to replace quite as many things? > The attached v53 incorporates the fixes discussed above, and builds green for > both OpenSSL and NSS in Cirrus on my Github repo (thanks again for your work > on > those files) so it will be interesting to see the CFBot running them. Looks like that worked... Greetings, Andres Freund