Re: apr_dbm and concurrency
On 25.09.2023 16:58, Joe Orton wrote:
> It is unspecified whether the apr_dbm.h interface is safe to use for
> multiple processes/threads having r/w access to a single database.
> Results appear to be:
>
> - sdbm, gdbm are safe
> - bdb, ndbm are not safe
>
> (Berkeley DB obviously can be used in a way which is safe for multiple
> r/w users but it appears to require using one of the more complicated
> modes of operation via a DB_ENV, and changing to that would not be
> backwards compatible with the current db format. Corrections welcome,
> not a database expert)

IIRC, Berkeley DB multi-process concurrency is managed through an on-disk "register" file external to the actual key/value store. The key/value store format is not affected by the presence of this file.

The DB_REGISTER mechanism was introduced in BDB 4.4 (now long defunct) and can be used for both concurrency control and automatic database recovery. The client-side code for this can be lifted from Subversion.

(I was involved in designing this mechanism for BDB and implementing its use in Subversion, but that was ages ago -- back in 2005. There may be better ways to do this in newer versions of Berkeley DB).

TL;DR: all upstream supported versions of BDB should have this mechanism available, and APR can detect if it's being used without changing the API, and even "upgrade" existing databases with the register file on the fly without affecting the actual database.

-- Brane
apr_dbm and concurrency
It is unspecified whether the apr_dbm.h interface is safe to use for multiple processes/threads having r/w access to a single database. Results appear to be:

- sdbm, gdbm are safe
- bdb, ndbm are not safe

(Berkeley DB obviously can be used in a way which is safe for multiple r/w users but it appears to require using one of the more complicated modes of operation via a DB_ENV, and changing to that would not be backwards compatible with the current db format. Corrections welcome, not a database expert)

This seems pretty bad: httpd's use of this interface depends on the DBM API being safe for concurrent use, but I'm not sure there is any good way forward. Options I can see:

1. Implement APR-specific locking inside apr_dbm for unsafe db types, e.g. by creating a lockfile ".lock" and using apr_file_lock

2. Drop concurrency-unsafe db methods... but, APR 2.x only?

3. No code change. Describe the state of concurrency-safety in the API for each db type. httpd and other users would be forced to select a DB type appropriate to the use case.

Any other suggestions, and any preferences among the above? I'm not sure if 3 isn't the least bad choice unfortunately.

Regards, Joe
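For reference, the lockfile approach in option 1 might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses POSIX flock() as a stand-in for apr_file_lock() so that it builds without APR, and the names dbm_lock_t, dbm_lock_acquire and dbm_lock_release are hypothetical, not part of any existing API. An APR version would presumably open the lock file with apr_file_open() and take APR_FLOCK_SHARED or APR_FLOCK_EXCLUSIVE via apr_file_lock().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper type: wraps the descriptor of a sidecar lock file
 * that lives next to the database (the path is chosen by the caller). */
typedef struct {
    int fd;
} dbm_lock_t;

/* Open (creating if necessary) the lock file and take an advisory lock.
 *   exclusive != 0  -> writer lock, exclusive == 0 -> shared reader lock
 *   nonblock  != 0  -> return EWOULDBLOCK instead of waiting
 * Returns 0 on success or an errno value on failure. */
int dbm_lock_acquire(dbm_lock_t *lock, const char *lock_path,
                     int exclusive, int nonblock)
{
    int op = exclusive ? LOCK_EX : LOCK_SH;

    assert(lock != NULL && lock_path != NULL);

    if (nonblock)
        op |= LOCK_NB;

    lock->fd = open(lock_path, O_RDWR | O_CREAT, 0644);
    if (lock->fd < 0)
        return errno;

    while (flock(lock->fd, op) != 0) {
        int err = errno;
        if (err == EINTR)
            continue;           /* interrupted by a signal: retry */
        close(lock->fd);
        lock->fd = -1;
        return err;
    }
    return 0;
}

/* Drop the lock and close the lock file. Safe to call on a lock whose
 * acquire failed (fd == -1). Returns 0 or an errno value. */
int dbm_lock_release(dbm_lock_t *lock)
{
    int rv = 0;

    if (lock->fd >= 0) {
        if (flock(lock->fd, LOCK_UN) != 0)
            rv = errno;
        close(lock->fd);
        lock->fd = -1;
    }
    return rv;
}
```

A writer would call dbm_lock_acquire() with exclusive set before apr_dbm_store() and dbm_lock_release() afterwards; readers would take the shared lock around apr_dbm_fetch(). The lock is advisory, so it only protects callers that go through the same wrapper, and threads within one process would still need a mutex on top.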