I know that Gregory Block (another castor committer) is using castor in a high-volume application.
Indeed we do.
1. Switch to version 0.9.6 of castor as we have fixed some bugs that may cause some of your problems.
Sidenote: Performance has, generally, improved recently. If you're not seeing performance improvements, then it's worth spending some time thinking about why.
2. Initialize your JDO or JDO2 (will be renamed to JDOManager at next release) instance once and reuse it all over your application. Don't reuse the Database instances.
Again: Never, ever reuse a Database instance. Creating them is inexpensive, and the JDBC rule is one thread -> one JDBC connection. Do not multithread inside of a Database instance; as a corollary, do not multithread on a single JDBC connection.
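A minimal sketch of that pattern: one shared factory for the whole application, and a fresh, short-lived instance per unit of work that never leaves the thread that opened it. `SharedFactory` and `Session` are hypothetical stand-ins for the JDO instance and the Database instance, so the example compiles without castor on the classpath; they are not castor classes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the shared JDO instance: built once, shared by all threads.
final class SharedFactory {
    static final AtomicInteger factoriesBuilt = new AtomicInteger();
    static final AtomicInteger sessionsOpened = new AtomicInteger();
    private static final SharedFactory INSTANCE = new SharedFactory();

    private SharedFactory() { factoriesBuilt.incrementAndGet(); }

    static SharedFactory get() { return INSTANCE; }

    // Stand-in for obtaining a fresh Database: cheap, never cached or shared.
    Session open() {
        sessionsOpened.incrementAndGet();
        return new Session(Thread.currentThread());
    }
}

// Hypothetical stand-in for a Database instance, tied to the thread that opened it.
final class Session implements AutoCloseable {
    private final Thread owner;
    private boolean closed;

    Session(Thread owner) { this.owner = owner; }

    void doWork() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("session used from another thread");
        }
        if (closed) {
            throw new IllegalStateException("session already closed");
        }
    }

    @Override
    public void close() { closed = true; }
}

final class PerThreadSessions {
    // Runs `tasks` units of work on a small pool; each opens and closes its own session.
    static int run(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try (Session s = SharedFactory.get().open()) {
                    s.doWork();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed=" + run(8)
                + ", factories=" + SharedFactory.factoriesBuilt.get());
    }
}
```

The point of the shape is that the expensive object is created exactly once, while the cheap one is created and discarded inside each task.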
3. Use a DataSource instead of a Driver configuration, as it enables connection pooling, which gives you a great performance improvement.
I highly suggest DBCP, here, with the beneficial use of prepared statement caching.
Should you be running on a system where read performance is critical, feel free to take the SQL code generated by castor, and dumped to logs during the DB mapping load in debug output, and turn those into stored procedures that you then invoke via call to perform those loads; however, I find personally that stored procedures would be a minimal improvement over the DBCP prepared statement cache; your mileage may vary. db.load() has performance benefits that are worth keeping, IMO, and the pleasure of having pretty stored procedures in your database is far outweighed by the nightmare of change management.
4. Always commit or roll back your transactions and close your Database instances properly, even in failure situations, as suggested by Nick previously.
Just the obvious general rule on Java objects that hold resources: Don't wait for the VM to finalize to have something happen to your objects when you could have released critical resources at the appropriate point in the codebase.
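As a rough illustration of releasing resources at the right point rather than waiting for finalization, the sketch below wraps a unit of work so that the transaction is always ended and the instance always closed. `FakeDatabase` is a hypothetical stand-in that only borrows a few method names (begin, commit, rollback, close, isActive); it is not the castor class, and `TxTemplate` is an invented helper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; records calls so the order of operations is visible.
final class FakeDatabase {
    final List<String> events = new ArrayList<>();
    private boolean active;

    void begin()    { active = true;  events.add("begin"); }
    void commit()   { active = false; events.add("commit"); }
    void rollback() { active = false; events.add("rollback"); }
    void close()    { events.add("close"); }
    boolean isActive() { return active; }
}

final class TxTemplate {
    // Runs `work` in one transaction; always ends the transaction and closes the database.
    static void inTransaction(FakeDatabase db, Runnable work) {
        try {
            db.begin();
            work.run();
            db.commit();
        } finally {
            try {
                // Still active here means commit() was never reached or did not complete.
                if (db.isActive()) {
                    db.rollback();
                }
            } finally {
                db.close();
            }
        }
    }

    public static void main(String[] args) {
        FakeDatabase ok = new FakeDatabase();
        inTransaction(ok, () -> { });
        System.out.println(ok.events);      // [begin, commit, close]

        FakeDatabase failing = new FakeDatabase();
        try {
            inTransaction(failing, () -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException expected) {
            System.out.println(failing.events);  // [begin, rollback, close]
        }
    }
}
```

The nested finally is deliberate: close() still runs even if rollback() itself throws.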
5. Keep your transactions as short as possible. If an open transaction holds a write lock on an object, no other transaction can get a write lock on the same object, and the attempt will end in a LockNotGrantedException.
Also keep in mind that folks using lockmode of dblocked do FOR UPDATE calls on things they read while the transaction is open; if you're using dblocked mode, be aware of how your application does things. If you're in one of the other modes, locks happen inside castor, and it's your responsibility to always use the right access mode when accessing content.
If, for example, you can decide at the API layer whether an operation will ever need to modify an object, and you know that you will only ever use an instance in read only mode, load it with access mode read only, and not shared.
Limit use of read-write objects to situations in which it is likely you will need to perform updates.
Read-write performance will change dramatically once the TransactionContext patches have been checked in; if you'd like to guinea pig them, check JIRA and try the patch out; we're already using it in production here.
6. Query or load your objects read only whenever possible. Even if castor creates a lock on them this does not prevent other threads from reading or writing them. Read only queries are also about 7 times faster compared with default shared mode.
Cannot stress how important this is: If 99% of your application never writes an object, and you as a programmer know it won't, then do something about it. If you're in a situation where you want the object to be read-only most of the time, and only want a writable every now and then, do so just-in-time by performing a load-modify-store operation in a single transaction for the shareable you want.
In other words: Don't use read-write objects unless you know you're likely to want to write them.
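To make the locking argument concrete, here is a toy lock table, not castor's implementation. It assumes a deliberately simplified model: a read-write load takes an exclusive write lock until the transaction ends, and a read-only load does no lock bookkeeping at all. `ToyLockTable` and `LockDemo` are invented names, and a refused lock is reported as `false` where castor would raise LockNotGrantedException.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: at most one transaction may hold the write lock on an object id.
final class ToyLockTable {
    private final Map<Object, String> writeLocks = new HashMap<>();

    // Returns true if the load is granted, false if another transaction holds the lock.
    synchronized boolean load(Object id, String tx, boolean readOnly) {
        if (readOnly) {
            return true;                 // simplification: read-only never competes for a lock
        }
        String holder = writeLocks.get(id);
        if (holder != null && !holder.equals(tx)) {
            return false;                // lock not granted
        }
        writeLocks.put(id, tx);
        return true;
    }

    // Commit or rollback: releases every lock the transaction holds.
    synchronized void end(String tx) {
        writeLocks.values().removeIf(tx::equals);
    }
}

final class LockDemo {
    public static void main(String[] args) {
        ToyLockTable locks = new ToyLockTable();
        System.out.println(locks.load(42, "A", false)); // true: A now holds the write lock
        System.out.println(locks.load(42, "B", false)); // false: B is refused while A is open
        System.out.println(locks.load(42, "B", true));  // true: read-only is unaffected
        locks.end("A");                                 // A commits, keeping its transaction short
        System.out.println(locks.load(42, "B", false)); // true: B can write now
    }
}
```

In this model the only way B gets refused is by asking for write access while A is still open, which is why both short transactions and read-only loads reduce contention.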
7. Where possible, prefer db.load(Class, object) over db.execute(String). I suggest that because db.load() first tries to load the requested object from cache and only retrieves it from the database when it is not available there. When executing queries with db.execute(), the object will always be loaded from the database without looking at the cache. You may gain an improvement by a factor of 10 or more when changing from db.execute() to db.load().
I've never touched db.execute() - it never struck me that you'd want to. :)
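The difference described in point 7 can be sketched with a toy store, which is not castor's cache: loading by identity consults a map first, while a query always makes a (simulated) database round trip. `ToyStore` and `CacheDemo` are invented for illustration, and the counts say nothing about real speedups.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of a cache-first load next to an always-hit-the-database query.
final class ToyStore<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> database;
    int databaseHits;

    ToyStore(Function<K, V> database) { this.database = database; }

    // Analogous to loading by identity: only a cache miss reaches the database.
    V load(K id) {
        V cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        V fresh = fetch(id);
        if (fresh != null) {
            cache.put(id, fresh);
        }
        return fresh;
    }

    // Analogous to running a query: always a database round trip.
    V query(K id) { return fetch(id); }

    private V fetch(K id) {
        databaseHits++;
        return database.apply(id);
    }
}

final class CacheDemo {
    // Reads the same id `reads` times and reports how often the database was hit.
    static int hits(boolean useLoad, int reads) {
        ToyStore<Integer, String> store = new ToyStore<>(id -> "row-" + id);
        for (int i = 0; i < reads; i++) {
            if (useLoad) {
                store.load(7);
            } else {
                store.query(7);
            }
        }
        return store.databaseHits;
    }

    public static void main(String[] args) {
        System.out.println("load:  " + hits(true, 100));   // load:  1
        System.out.println("query: " + hits(false, 100));  // query: 100
    }
}
```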
a. Have a look at http://jira.codehaus.org/browse/CASTOR-1085, where a patch to TransactionContext is attached that improves read/write performance by a factor of 3. Even though the patch passes all tests of the castor test framework, it requires more testing before we will integrate it in our next major release. As stated in the comment, Gregory will use the patch in his production environment soon.
It's in production now; and several large content imports of guide content have been run through it without any difficulties or problems in the generated content.
Now, there's lots left to do - there is still the issue, for example, of dependent objects being slightly sub-optimal in performance both in terms of the SQL that gets generated and the way it gets managed - but there will be improvements over time to the way that this and other operations are performed.
But performance *should be good right now*. If it isn't, you'll need to think about whether you are using the optimal set of operations. No environment can predict your requirements - hinting to the system when objects can be safely assumed to be read-only is vital to a high-performance implementation.
Cheers, Greg

