Nelson B wrote:
Rob Crittenden wrote:
I'm having an issue with mod_nss, an Apache module I wrote that provides SSL using NSS.

The way Apache loads modules is a tad strange.

I'd say it's more than a tad!

What it does is it loads them one time in order to get its list of
configuration directives and it verifies that the configuration is ok. It
also runs through the initialization routines.

In my case this is needed so I still have stdin/stdout and can prompt for the PIN.

Can't you use /dev/tty instead of stdin/stdout?

I suppose so, I'll look into that for the future but it would require a fair bit of restructuring.
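
For reference, a minimal sketch of the /dev/tty idea, assuming a POSIX system. The helper name read_pin and its signature are hypothetical and are not taken from mod_nss or NSS. It opens the terminal device directly, so it does not depend on stdin/stdout still being open, and it turns echo off only when the path really is a terminal.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: read one line (the PIN) from the device at `path`,
 * normally "/dev/tty". Input longer than len - 1 bytes is truncated.
 * Returns 0 on success, -1 if the device cannot be opened or no input
 * was read. */
int read_pin(const char *path, const char *prompt, char *buf, size_t len)
{
    struct termios saved, quiet;
    size_t n = 0;
    char c;

    assert(path != NULL && prompt != NULL && buf != NULL && len > 0);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Only prompt and disable echo when this really is a terminal. */
    int is_tty = isatty(fd) && tcgetattr(fd, &saved) == 0;
    if (is_tty) {
        if (write(fd, prompt, strlen(prompt)) < 0) {
            close(fd);
            return -1;
        }
        quiet = saved;
        quiet.c_lflag &= ~(tcflag_t)ECHO;
        tcsetattr(fd, TCSAFLUSH, &quiet);
    }

    /* Read up to the newline, leaving room for the terminator. */
    while (n + 1 < len && read(fd, &c, 1) == 1 && c != '\n')
        buf[n++] = c;
    buf[n] = '\0';

    if (is_tty) {
        tcsetattr(fd, TCSAFLUSH, &saved);       /* restore echo */
        if (write(fd, "\n", 1) < 0) {
            /* cosmetic only, ignore */
        }
    }
    close(fd);
    return n > 0 ? 0 : -1;
}
```

A caller would use something like read_pin("/dev/tty", "Enter PIN: ", pin, sizeof pin). This only works while the process still has a controlling terminal, which may not be the case once the server has detached.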


Once the first round of module loading is done stdin/out/err are all closed.

Apache then unloads the module, then reloads it again.

I'm not really asking you to defend any of that, but ... do you have any
idea what (if anything) is the benefit of that strategy?

Each module can provide its own set of configuration directives. Apache needs to figure out what those are so it loads the module to find them. I'm not entirely sure why it runs the init and shutdown though. Perhaps for the stdin/stdout problem.


I had to write a fair bit of code to handle this, in particular because NSS
needs to shut down gracefully otherwise it won't start up again once the
module gets loaded for the 2nd time.

Yes, that's not uncommon.  Opens need to be closed, mallocs need to be
freed (as you know :).

The specific problem I'm having is with the NSS session cache.

You know that NSS has (potentially) two separate session caches, the
client cache (of sessions which NSS began by acting as an SSL client), and
the server session cache (of sessions for which NSS acted as the server).
They're managed entirely separately.

Yes, the current problem is the server cache.


I periodically get a core dump in the LockPoller thread (sslsnce.c).

In this case, it is apparent that you're discussing the server cache.

The server cache can operate in either of two modes: single process or
multiple process.  There are separate server cache initialization functions,
one for single process server caches, and another for multi-process server
caches (it has the letters MP in the name).

In the latter case, the server session cache is kept in shared memory,
so that multiple processes can all share a single common pool of sessions.
One process creates the cache in unnamed (a.k.a. "anonymous") shared
memory and the other processes that share it must all inherit it as
children of the process that created it.  The parent process also creates
a thread to watch the shared memory, to deal with the possibility that
one of the children might die while holding a lock in that shared memory.
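
To illustrate the inheritance point, here is a small POSIX sketch. It is not NSS code: shared_cache, cache_create and fork_child_and_count are hypothetical names, and the "cache" is a single counter. The mapping has no name, so a process can only reach it by being forked from the creator after the mapping exists.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on strict glibc builds */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the shared server session cache. */
typedef struct {
    int sessions;
} shared_cache;

/* Create the cache in anonymous shared memory. This must happen in the
 * parent BEFORE any fork(), or the children cannot see it. */
shared_cache *cache_create(void)
{
    void *p = mmap(NULL, sizeof(shared_cache), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    shared_cache *cache = p;
    cache->sessions = 0;
    return cache;
}

void cache_destroy(shared_cache *cache)
{
    assert(cache != NULL);
    munmap(cache, sizeof *cache);
}

/* Fork one child that records a session in the inherited cache, wait
 * for it, and return the session count as seen by the parent
 * (-1 on error). */
int fork_child_and_count(shared_cache *cache)
{
    assert(cache != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        cache->sessions += 1;   /* child writes to the shared mapping */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return cache->sessions;     /* parent observes the child's write */
}
```

A real cache would also need locks living in that shared memory, which is exactly why a watcher thread is needed in case a child dies while holding one.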

Right, I have to handle both cases. Apache has a multi-process and a multi-threaded mode.


On some unix/linux systems, there is a problem if the parent has multiple
threads going when it forks a child.  So we need to have a way to stop
the lock poller thread prior to fork, and start it up again in the parent
after the fork.  See https://bugzilla.mozilla.org/show_bug.cgi?id=339466

Yes, this bug sounds like my problem.


The cache is disappearing underneath the thread and bad things happen. It's basically a race condition to see if this thread can exit before its data disappears.

The lock poller thread definitely needs to be stopped in the parent
process before unloading the module.
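
The ordering that matters here (ask the thread to stop, wait for it to exit, and only then release the memory it reads) can be sketched with plain pthreads. This is not the NSS LockPoller code; the poller type and the poller_start/poller_stop functions are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep on strict builds */
#endif
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical poller: a thread that keeps touching `cache` until told
 * to stop. */
typedef struct {
    pthread_t thread;
    atomic_bool stop;
    long *cache;            /* stand-in for the memory being watched */
} poller;

static void *poll_loop(void *arg)
{
    poller *p = arg;
    struct timespec nap = { 0, 1000000 };   /* 1 ms between polls */
    while (!atomic_load(&p->stop)) {
        p->cache[0]++;      /* would crash if the cache were already gone */
        nanosleep(&nap, NULL);
    }
    return NULL;
}

poller *poller_start(void)
{
    poller *p = calloc(1, sizeof *p);
    if (p == NULL)
        return NULL;
    p->cache = calloc(1, sizeof *p->cache);
    if (p->cache == NULL) {
        free(p);
        return NULL;
    }
    atomic_init(&p->stop, false);
    if (pthread_create(&p->thread, NULL, poll_loop, p) != 0) {
        free(p->cache);
        free(p);
        return NULL;
    }
    return p;
}

/* Stop the thread FIRST, then free what it was using. Calling this
 * with NULL (nothing was started) is a harmless no-op. */
int poller_stop(poller *p)
{
    if (p == NULL)
        return 0;
    atomic_store(&p->stop, true);
    if (pthread_join(p->thread, NULL) != 0)
        return -1;
    free(p->cache);         /* safe: the thread has exited */
    free(p);
    return 0;
}
```

Freeing the cache without the join is the race described above: whether it crashes depends on whether the thread happens to exit before its data disappears.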

However, there is an issue here regarding inheritance.  It would be
wrong to create the shared memory cache in a parent process, fork the
children, then shutdown the shared memory cache and create another one.
So, this raises numerous questions and ideas.  I'll start with one question.

Q1) Is this a multi-process shared server cache installation?
    Do you intend to operate the shared memory session cache?
    If not, the easiest solution is for you to use the single process
    server session cache.

It is multi-process (we're talking Apache here) and I don't really have a choice in the matter.


A potential fix I have is to not initialize the cache during the first module load.

Seems to me that your options might include:
a) do NOTHING on the first module load, and do everything on the second
   module load.
b) do all the work on the first module load.  Don't really shut down the
   module after the first load (that is, pretend that you shut down).
   Then do nothing on the second load, and continue to use the stuff loaded
   the first time.
c) really startup and shutdown everything all the way twice.

I'm doing c) now and it's crashing, unless I've missed something. It definitely seems like a timing bug because it doesn't crash all the time and tends to crash more on slower machines than faster ones.

I don't have the choice with b). Apache forcibly unloads the module (dlclose()). If I haven't shut things down right I'll get a SEC_ERROR_BUSY error during subsequent reloads.

I can't do a) because of the stdin/stdout problem. I need to get the PIN to authenticate the database. I can't think of anything at the moment but I wonder if it would be advantageous to catch certain config problems early in the process.


The questions in my mind are: when (if ever) does the process doing this
fork the children that will share the session cache?
Is it during the first time the module is loaded?
or between the first and second time?
or after the second time?
or is this only in the children processes?   or ??
or are there perhaps no children involved?

It is my understanding that the fork happens after all the modules are loaded and initialized for the second time.


The answers to those questions would be expected to suggest how best to
handle all this module loading/unloading.

I've always been under the impression that initializing the cache is one of the things one should do in an NSS app and I don't want to introduce other, worse side-effects.

A server app should definitely initialize the server session cache before
starting to accept any SSL/TLS connections.  When multiple processes are
all accepting connections on a shared port, they should also all share
the server session cache for that port.  The cache has to be initialized
before the child processes are forked so that they will inherit it,
because inheritance is the only way to get the shared memory cache.

Right. So what I am thinking is that I'll delay the cache initialization until the "final" module load. As long as NOT initializing the cache won't have any bad side effects I'll be ok. I can be certain that Apache won't be accepting any connections during the first load.

To complicate matters, PKCS#11 modules should NOT be loaded when a process
forks.  Each child process must initialize its PKCS#11 modules for itself.

So, for multi-process shared-memory servers, the parent should create the
server session cache and create the listen socket(s), THEN spawn the
children, which will inherit the shared cache and shared listen sockets,
and then finally it (and each of the children) should initialize the
PKCS#11 module for itself.  Or, if the parent will not also act as a server,
but only the children will accept connections on sockets, the parent can
skip initializing the PKCS#11 module altogether.

This means that initializing the NSS/SSL server session cache takes place
BEFORE calling NSS_Init (or any of its variants).  Many find this surprising.

The "selfserv" demo server app does all this correctly, IINM.

Interesting. I'm doing it wrong then. I'll take a look at selfserv and switch things around as appropriate. Who knows, maybe this will help the cache shutdown :-)


Assuming that is ok, is it bad to call SSL_ShutdownServerSessionIDCache() if the cache hasn't been initialized?

No, that should be harmless, as long as you don't have multiple threads
trying to do this simultaneously. Of course, it would be a programming
error to try to do this in multiple threads simultaneously.

I briefly looked at the code and it seems ok to me but I don't want to make assumptions.

It is intended to be safe to close the global server cache instance even
if it has not been initialized.  If it should prove not to be safe, that
would be an NSS bug.

thanks

Hope this helps.
I have a feeling I've just made the problem seem bigger :)

Not really, but as expected you raised a number of things for me to think about :-)

For now I'm going with the deferred cache initialization. It's the safest, easiest (and quickest) solution for the moment.

Thanks

rob
_______________________________________________
dev-tech-crypto mailing list
dev-tech-crypto@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-crypto
