Hi -- I've been spending a lot of time staring at mod_dbd and have created four sequential patches for trunk:
http://people.apache.org/~chrisd/patches/mod_dbd_pools_groups/

I've tested each one in both the threads and no-threads cases, making sure it compiles and runs with mod_authn_dbd. I've also tested the cumulative patch fairly extensively, using both ap_dbd_acquire() and ap_dbd_open()/ap_dbd_close() simultaneously (in different threads). I checked the threads and no-threads cases, with DBDPersist on and off for each, making sure all types of connections were always closed at the appropriate time. I also forced dbd_setup() to fail in the child_init hook to test the code path where dbd_setup() is called later, after acquiring a mutex.

Would other folks mind taking these out for a trial run? One thing I'd be interested to know is whether multiple drivers work OK; with the cumulative patch it's possible to point one set of hosts to one kind of DB and another set to a different DB. Do the APR DBD drivers play well with each other if they're loaded into the same process?

Another pair of questions I have relates to the netware and beos MPMs and their handling of child_init hook functions and memory pools. Skim down to the "***" for details.

Here's a rundown of the changes in the four patch files:

First, a pile of cleanups, including some function and variable renaming. This is primarily for the sake of readability and clarity (and my sanity).

Second, some miscellaneous fixes and changes, including:

- Using ap_log_error() or ap_log_rerror() instead of ap_log_perror() wherever possible, since without a server_rec, log_error_core() ignores messages below the default server log level.

- In the no-threads versions of ap_dbd_[c]acquire(), in the non-persistent case, cleanups are registered passing svr->conn as the data; but that's NULL, since it's the non-persistent case.
- In the no-threads persistent case, ap_dbd_open() makes a call to apr_dbd_error() passing arec->driver and arec->handle if apr_dbd_check_conn() fails, but arec/rec is NULL; it should be svr->conn->driver and svr->conn->handle.

- Eliminating code duplication between the threads and no-threads cases.

- Including apr_lib.h and using apr_isdigit() instead of including ctype.h and using isdigit() directly.

- Including apr_want.h and using APR_WANT_MEMFUNC and APR_WANT_STRFUNC.

Third, a fairly major change to deal with the problems identified in PR #39985:

http://issues.apache.org/bugzilla/show_bug.cgi?id=39985

One issue raised by this bug report is that in the non-threaded persistent case, the DB connection is opened using s->process->pool, and dbd_close() is then registered as a cleanup on that pool. However, the MPMs never call apr_pool_destroy() on that pool, so the DB connections aren't closed properly on shutdown.

The larger issue raised in the report is that in the threaded, persistent case, segfaults could occur during shutdown. This is ultimately due to a second call to apr_pool_destroy() on a pool that was previously destroyed. More generally, it is due to the fact that there's no easy way to use reslists if you plan to create a sub-pool in the constructor -- but creating a sub-pool is often the right thing to do. This patch works around the problem by passing to apr_reslist_create() a pool for which apr_pool_destroy() is never called. For more details, see:

http://marc.theaimsgroup.com/?l=apr-dev&m=116683262613500&w=2

It turns out that the child processes created by the MPMs call apr_pool_destroy() on their pchild pools, and this is the pool passed to the child_init hook functions. Currently dbd_setup_init() passes this pool to dbd_setup(), which creates a sub-pool and passes that to apr_reslist_create(). Luckily, however, it turns out that s->process->pool, which is created in the main process, is not destroyed when a child process exits.
This means that we can use it to create the reslist, and the reslist's internal cleanup function will never be called. Instead, we explicitly register apr_reslist_destroy() as a cleanup on a sub-pool of pchild which is created in dbd_setup_init(). At shutdown, this cleanup takes care of closing all the DB connections without causing a double-free situation.

(*** As a sidebar issue, it looks to me as though the netware MPM doesn't call apr_pool_destroy() on the pmain pool it creates and passes to the child_init hook functions. Also, the beos MPM doesn't seem to call the child_init hook functions at all, because it only creates threads. I know nothing about either platform, but I'm making the assumption here that these are both things which should be fixed. If that's a bad assumption, please holler!)

On advice from Nick Kew, I revised dbd_construct() so that it allocates its ap_dbd_t structure out of a sub-pool; this means the reslist won't leak memory over time. The dbd_close() function is registered as a cleanup on the sub-pool, and it simply closes the DB connection. Also on Nick's advice, the constructor creates a sub-sub-pool for use by apr_dbd_prepare().

Now dbd_destruct() can simply call apr_pool_destroy() on the sub-pool. That causes the sub-sub-pool to be destroyed first, so any cleanup functions registered by the driver in apr_dbd_prepare() will be executed; next dbd_close() will run and close the DB connection; and finally the ap_dbd_t structure will be freed. The prepared statements are thus cleaned up before the DB handle is closed; Nick pointed out that some drivers may not expect to handle the reverse case (cleaning up statements after the connection is closed).

One nice consequence of these changes is that the ap_dbd_[c]acquire() functions can now be slimmed down quite a bit.
They don't need to register any cleanups in the non-persistent cases, because this is taken care of for them in dbd_construct(); dbd_close() will be called when the per-request or per-connection pool is destroyed or cleared.

Fourth, a more comprehensive fix for the problems first identified circa r424798:

http://svn.apache.org/viewvc?view=rev&revision=424798

This and several subsequent patches dealt with the problem that if you had this in a config file:

<VirtualHost ...>
    <Location /...>
        AuthDBDUserPWQuery ...
    </Location>
</VirtualHost>

then when mod_authn_dbd's handler for the AuthDBDUserPWQuery directive ran, the virtual host had no entry for mod_dbd in its module_config, and so when the handler called ap_dbd_prepare() and that function tried to use the value returned from ap_get_module_config(s->module_config, &dbd_module), a segfault would result. It's worth noting that this would occur while processing command directives, before the merge functions were run to merge virtual host configs with the main server's.

However, there's still a problem if you put any mod_dbd commands into the virtual host. In the simplest case they are simply ignored:

<VirtualHost ...>
    DBDPrepareSQL ...
</VirtualHost>

This is because the post_config hook function introduced to deal with the segfault issue does the following, in which the key is always derived from s, not sp, so each virtual host gets its SQL queries to prepare from the main server only:

    for (sp = s; sp; sp = sp->next) {
        const char *key = apr_psprintf(s->process->pool, "%pp", s);
        svr = ap_get_module_config(sp->module_config, &dbd_module);
        svr->prepared = apr_hash_get(dbd_prepared_defns, key,
                                     APR_HASH_KEY_STRING);
    }

There's no simple fix for this, because the server config merge function is no longer dealing with the queries to prepare at all. (There's a leftover line that makes it look like it is, but it isn't: at the merge stage both add->prepared and base->prepared are still NULL; they only get set in the post_config hook function.)

More seriously, the following configuration causes mod_authn_dbd to always fail to authenticate (at least in the threaded, persistent case):

<VirtualHost ...>
    DBDPrepareSQL ...
    <Location /...>
        AuthDBDUserPWQuery ...
    </Location>
</VirtualHost>

This is because the DBDPrepareSQL directive causes mod_dbd's server config creation function to be run for the virtual host (the opposite of the first example above), generating a unique server config for the virtual host. The virtual host's query to prepare is still ignored; but in addition, during authentication mod_authn_dbd gets a NULL ap_dbd_t from ap_dbd_acquire() and fails for each request to the restricted location. That's because when ap_dbd_acquire() calls ap_dbd_open(), the virtual host's server config is the one retrieved by ap_get_module_config(); its dbpool reslist is NULL because the child_init hook function only sets up a reslist for the main server, so dbd_setup_lock() tries to run, but finds a NULL mutex in the server config and assumes the child_init hook function failed to create a reslist and a mutex.
Again, there's no simple patch for this, because it raises the question of how many reslists or connections there should be in the persistent cases -- one for each virtual host? Each virtual host can have its own DBDParams and so forth, of course, but suppose one wanted a fairly simple configuration, with 20 virtual hosts connected to one DB and another 20 connected to another DB? Clearly, having 40 pools of connections isn't the right thing to do.

My changes aim to deal with all of these issues by introducing a set of "database configuration groups". Server configs are compared in the post_config hook function to find a minimal set of such groups; each group then gets its own pool of connections. Groups are only used for persistent configurations.

Queries to prepare are added to a hash during the configuration phases; this allows for normal apr_hash_overlay() merging of main and virtual host server configs, as well as allowing virtual hosts to deregister a query inherited from the main server. It also lets us warn about conflicts between query labels more easily.

Note that if virtual hosts only add additional queries to prepare, distinct configuration groups are not created. Thus using mod_authn_dbd and AuthDBDUser[PW|Realm]Query directives in virtual hosts does not cause extra connection pools; you have to tinker with the DB connection parameters or pool configurations, or explicitly alter or deregister a query, in order for a distinct group to be required.

Down the road, it might be nice to allow something like this in the main server:

<DBDGroup foo>
    DBDParams ...
    ...
</DBDGroup>

so that virtual servers could then just include a "DBDGroup foo" directive and not have to repeat exactly the same mod_dbd directives in order to be recognized as being able to share a group config. However, this requires some config mojo that I just don't have at the moment. The framework for it is all there, though, I think.

Chris.
--
GPG Key ID: 366A375B
GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B