All,

Thanks for all your advice on how to improve the performance.

Here are the things I did.

1. Switched to Castor 0.9.6.
2. Used JDO2 (though even before that we were using a single JDO
instance).
3. Made my transactions as short as possible.
4. Made the transactions read-only.
5. Used db.load() instead of db.execute().
   I could not use db.load() where I need to query for a list of data. I
   don't know whether that is possible, but from the javadoc I think we
   have to use db.execute() to query for a list of data (i.e. when
   non-primary-key fields are used in the query).
6. Cached the objects with a time limit.

We have been using DBCP JNDI with tomcat with
org.apache.commons.dbcp.BasicDataSourceFactory. 

Performance improved by a factor of 6. :) Is there anything more that I
can do?

Does Castor have a distributed cache for use in a clustered environment?

Since the objects we configured in Castor's cache are time-limited (15
minutes in our case), in a clustered environment each Castor instance (one
per app server) has its own cache. How will the cache in the second
instance be updated if the cache in the first instance is updated via the
application?
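
To make the question concrete, here is a minimal sketch of the scenario, assuming each node keeps an independent time-limited cache as described above. TimeLimitedCache and ClusterStalenessDemo are hypothetical stand-ins written for illustration, not Castor classes, and the clock is injected so the 15-minute limit can be simulated.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical per-JVM, time-limited cache; a stand-in, not a Castor class. */
final class TimeLimitedCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;

    TimeLimitedCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    synchronized void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the cached value, or null if it is absent or has expired. */
    synchronized V get(K key) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (clock.getAsLong() >= e.expiresAtMillis) {
            entries.remove(key);
            return null;
        }
        return e.value;
    }
}

final class ClusterStalenessDemo {
    /** Returns { value on node A, value on node B before expiry, value on node B after expiry }. */
    static String[] run() {
        long[] now = {0L};                      // simulated clock, in milliseconds
        long ttl = 15 * 60 * 1000L;             // 15 minutes, as in our configuration
        TimeLimitedCache<Integer, String> nodeA = new TimeLimitedCache<>(ttl, () -> now[0]);
        TimeLimitedCache<Integer, String> nodeB = new TimeLimitedCache<>(ttl, () -> now[0]);

        nodeA.put(1, "v1");
        nodeB.put(1, "v1");
        nodeA.put(1, "v2");                     // the application updates the object via node A only

        String onNodeA = nodeA.get(1);
        String onNodeBBeforeExpiry = nodeB.get(1);
        now[0] = ttl;                           // advance the clock past the time limit
        String onNodeBAfterExpiry = nodeB.get(1);
        return new String[] {onNodeA, onNodeBBeforeExpiry, onNodeBAfterExpiry};
    }
}
```

In this model node B keeps serving the old value until its entry expires and is reloaded, which is the window I am asking about.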

Thanks for all your help.

- Vijay

   


-----Original Message-----
From: Gregory Block [mailto:[EMAIL PROTECTED] 
Sent: Saturday, April 30, 2005 4:30 AM
To: [email protected]
Subject: Re: [castor-user] Transaction locks and muti threaded access

On 27 Apr 2005, at 11:30, Ralf Joachim wrote:
> I know that Gregory Block (another castor committer) is using castor in 
> a high-volume application.

Indeed we do.

> 1. Switch to version 0.9.6 of castor as we have fixed some bugs that 
> may cause some of your problems.

Sidenote:  Performance has, generally, improved recently.  If you're not
seeing performance improvements, then it's worth spending some time thinking
about why.

> 2. Initialize your JDO or JDO2 (will be renamed to JDOManager at next 
> release) instance once and reuse it all over your application.
> Don't reuse the Database instances.

Again:  Never, ever reuse a Database instance.  Creating them is
inexpensive, and the JDBC rule is one thread -> one JDBC connection.  Do
not multithread inside of a Database instance; as a corollary, do not
multithread on a single JDBC connection.

> 3. Use a Datasource instead of a Driver configuration as they enable 
> connection pooling which gives you a great performance improvement.

I highly suggest DBCP, here, with the beneficial use of prepared statement
caching.

Should you be running on a system where read performance is critical, feel
free to take the SQL code generated by castor, and dumped to logs during the
DB mapping load in debug output, and turn those into stored procedures that
you then invoke via call to perform those loads; however, I find personally
that stored procedures would be a minimal improvement over the DBCP prepared
statement cache; your mileage may vary.  db.load() has performance benefits
that are worth keeping, IMO, and the pleasure of having pretty stored
procedures in your database is far outweighed by the nightmare of change
management.

> 4. Always commit or rollback your transactions and close your Database 
> instances properly, even in failure cases, as suggested by Nick 
> previously.

Just the obvious general rule on Java objects that hold resources:   
Don't wait for the VM to finalize to have something happen to your objects
when you could have released critical resources at the appropriate point in
the codebase.
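
One way to make that release explicit is a try/finally wrapper around each unit of work. The sketch below is only an illustration under assumptions: UnitOfWork is a hypothetical stand-in whose method names mirror the begin/commit/rollback/close steps discussed in this thread, not the Castor interface itself, and RecordingUnitOfWork exists purely to show the call order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a per-unit-of-work database handle; not the Castor interface. */
interface UnitOfWork {
    void begin();
    void commit();
    void rollback();
    boolean isActive();
    void close();
}

/** The work to run inside one short transaction. */
interface TxBody {
    void run(UnitOfWork db);
}

final class TxTemplate {
    /** Runs the body in one transaction; always ends the transaction and closes the handle. */
    static void inTransaction(UnitOfWork db, TxBody body) {
        try {
            db.begin();
            body.run(db);
            db.commit();
        } finally {
            try {
                // Still active here means begin() succeeded but commit() was never reached.
                if (db.isActive()) {
                    db.rollback();
                }
            } finally {
                db.close();   // release the handle even if rollback itself fails
            }
        }
    }
}

/** Recording fake used only to demonstrate the order of calls. */
final class RecordingUnitOfWork implements UnitOfWork {
    final List<String> calls = new ArrayList<>();
    private boolean active;

    public void begin() { active = true; calls.add("begin"); }
    public void commit() { active = false; calls.add("commit"); }
    public void rollback() { active = false; calls.add("rollback"); }
    public boolean isActive() { return active; }
    public void close() { calls.add("close"); }
}
```

The handle passed in is meant to be a fresh one per call, used by a single thread and never shared, in line with point 2 above.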

> 5. Keep your transactions as short as possible. If you have an open 
> transaction that holds a write lock on an object no other transaction 
> can get a write lock on the same object which will lead to a 
> LockNotGrantedException.

Also keep in mind that folks using lockmode of dblocked do FOR UPDATE calls
on things they read while the transaction is open; if you're  
using dblocked mode, be aware of how your application does things.   
If you're in one of the other modes, locks happen inside castor, and it's
your responsibility to always use the right access mode when accessing
content.

If you can, for example, decide at the API layer whether or not an operation
is going to ever need to modify an object, and know that you will only ever
use an instance in read only mode, load objects with access mode read only,
and not shared.

Limit use of read-write objects to situations in which it is likely you will
need to perform updates.
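
To illustrate why long-lived write transactions hurt, here is a minimal model of the exclusive write lock from point 5. WriteLockTable is a hypothetical class written for this sketch; it is not castor's lock engine, and a refused lock is reported as a boolean rather than as a LockNotGrantedException.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical per-object write-lock table; a model of the behaviour described above. */
final class WriteLockTable {
    private final Map<Object, Semaphore> locks = new ConcurrentHashMap<>();

    /** Tries to take the exclusive write lock for an identity; false if not granted in time. */
    boolean tryWriteLock(Object identity, long timeoutMillis) {
        Semaphore lock = locks.computeIfAbsent(identity, k -> new Semaphore(1));
        try {
            return lock.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt for the caller
            return false;
        }
    }

    /** Releases the write lock, as a commit or rollback would. Call only while holding it. */
    void release(Object identity) {
        Semaphore lock = locks.get(identity);
        if (lock != null) {
            lock.release();
        }
    }
}
```

While one transaction holds the lock on an object, a second request for the same object waits and is then refused; once the first transaction ends, the lock is granted again. The shorter the first transaction, the smaller that window.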

Read-write performance will change dramatically once the TransactionContext
patches have been checked in; if you'd like to guinea pig them, check JIRA
and try the patch out; we're already using it in production here.

> 6. Query or load your objects read only whenever possible. Even if 
> castor creates a lock on them this does not prevent other threads from 
> reading or writing them. Read only queries are also about 7 times 
> faster compared with default shared mode.

Cannot stress how important this is:  If 99% of your application never
writes an object, and you as a programmer know it won't, then do something
about it.  If you're in a situation where you want the object to be
read-only most of the time, and only want a writable every now and then, do
so just-in-time by performing a load-modify-store operation in a single
transaction for the shareable you want.

In other words:  Don't use read-write objects unless you know you're likely
to want to write them.

> 7. Where possible, prefer db.load(Class, Object) over
> db.execute(String). I suggest that because db.load() first tries to
> load the requested object from the cache and only retrieves it from
> the database when it is not available there. When executing queries
> with db.execute() the object will always be loaded from the database
> without looking at the cache. You may gain an improvement by a factor
> of 10 or more when changing from db.execute() to db.load().

I've never touched db.execute() - it never struck me that you'd want to.  :)
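
The difference Ralf describes in point 7 can be modelled in a few lines. This is a sketch under assumptions, not castor code: CachedStore is a hypothetical class in which load() stands for a lookup by primary key and queryByPrefix() stands for a query on a non-key condition, and it simply counts how often the "database" is read.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the two access paths described in point 7; not castor code. */
final class CachedStore {
    private final Map<Integer, String> table = new HashMap<>();   // stands in for the database
    private final Map<Integer, String> cache = new HashMap<>();   // stands in for the object cache
    int databaseReads = 0;

    void insert(int id, String value) {
        table.put(id, value);
    }

    /** Lookup by primary key: consult the cache first, read the database only on a miss. */
    String load(int id) {
        String cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        databaseReads++;
        String row = table.get(id);
        if (row != null) {
            cache.put(id, row);
        }
        return row;
    }

    /** Query on a non-key condition: always reads the database, bypassing the cache. */
    List<String> queryByPrefix(String prefix) {
        databaseReads++;
        List<String> matches = new ArrayList<>();
        for (String value : table.values()) {
            if (value.startsWith(prefix)) {
                matches.add(value);
            }
        }
        return matches;
    }
}
```

Loading the same id repeatedly costs one database read in this model, while repeating the query costs one read every time.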

> a. Have a look at http://jira.codehaus.org/browse/CASTOR-1085, where a
> patch to TransactionContext is attached that improves read/write
> performance by a factor of 3. Even though the patch passes all tests
> of the castor test framework, it requires more testing before we
> integrate it in our next major release. As stated in the comment,
> Gregory will use the patch in his production environment soon.

It's in production now; and several large content imports of guide content
have been run through it without any difficulties or problems in the
generated content.


Now, there's lots left to do - there is still the issue, for example, of
dependent objects being slightly sub-optimal in performance both in terms of
the SQL that gets generated and the way it gets managed - but there will be
improvements over time to the way that this and other operations are
performed.

But performance *should be good right now*.  If it isn't, you'll need  
to think about whether you are using the optimal set of operations.   
No environment can predict your requirements - hinting to the system when
objects can be safely assumed to be read-only is vital to a
high-performance implementation.

Cheers,
Greg

