I'm not an expert on DB persistence managers, but what about allowing the definition of a DataSource reference instead of having to hard-code JDBC URLs? As a consequence, there would also be no need for the DBCP dependency.
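
To make the idea a bit more concrete, here is a rough sketch of what I have in mind. It is not how the current DB PM works: DataSourceWriter and its Work interface are names I made up for illustration, and I am assuming the DataSource would be configured by name (for example through JNDI) and handed to the persistence manager by the container.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/**
 * Hypothetical helper (not part of Jackrabbit): borrows a connection from a
 * DataSource for each unit of work instead of holding one shared connection.
 */
public class DataSourceWriter {

    /** A unit of work executed against a borrowed connection. */
    public interface Work {
        void run(Connection con) throws SQLException;
    }

    private final DataSource dataSource;

    // Assumption: the DataSource is supplied from outside (e.g. a JNDI lookup
    // done by the container), so no JDBC URL appears in this class.
    public DataSourceWriter(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Runs the work in its own transaction on its own connection. */
    public void write(Work work) throws SQLException {
        // try-with-resources hands the connection back when the block exits
        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(false);
            try {
                work.run(con);
                con.commit();
            } catch (SQLException | RuntimeException e) {
                // Undo partial changes before propagating the failure
                con.rollback();
                throw e;
            }
        }
    }
}
```

Since each write borrows its own connection, writers would no longer have to synchronise on a single shared one. If the configured DataSource is backed by a pool, whether dead connections get replaced would depend on that pool's validation settings.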
Martin

On 2/2/06, Miro Walker <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> We've been discussing the DB PM implementation and have a couple of
> questions about it. At the moment, the Simple DB PM appears to have
> been implemented using a single connection, with all write operations
> synchronised on a single object. This would imply that all writes to
> the database are single threaded, effectively making any application
> using it also run single threaded for write operations. This appears
> to have two implications:
>
> 1. Performance - in a multi-user system, having single-threaded writes
> to the database will make the JDBC connection a serious bottleneck as
> soon as the application comes under load. It also means that any
> background processing that needs to iterate over the repository making
> changes (and we have a few of those) will effectively bring all other
> users to a grinding halt.
>
> 2. Transactions - we haven't tested this (as the recent support for
> transactions in versioning operations has not been integrated into our
> system), but it appears that if a single connection is being used,
> then we can only have a single transaction active at any one time. So,
> if each user tries to execute a transaction with multiple write
> operations in it, and these transactions are to be propagated through
> to the database, then each transaction must complete before the next
> can begin. This would mean either that we get exceptions if the system
> attempts to interleave operations from different transactions, or that
> each transaction must complete in full before another can begin,
> further compounding the performance issue.
>
> In addition to the implications of using a single synchronised
> connection, another issue appears to be that the system will be unable
> to recover from a connection failure. For example, if the system were
> deployed onto a highly available database cluster, then in the event
> of a DB instance failure, any open connections will be killed, but can
> quite happily be reopened later. Jackrabbit appears to create a
> connection on initialisation, and has no way to recover if that
> connection is killed.
>
> I know that questions around implementing support for connection
> pooling on the DB have been raised before and then dismissed as
> unimportant, but this appears to me to be pretty fundamental. By using
> a connection pool implementation that supports recreating dead
> connections and tying a connection to a transaction context, multiple
> transactions could run in parallel, helping throughput and making the
> system more reliable.
>
> What do people think? Could we look to use Jakarta Commons DBCP?
>
> Cheers,
>
> Miro