I could not use db.load() where I need to query for a list of data. I
don't know if that is possible, but from looking at the javadoc, I think
we have to use db.execute() for querying a list of data (i.e. when
non-primary keys are used in the query).
Personally, I run a stored procedure to return the list of IDs, then hand the ID array off to a 'bulk loader' which, for read-only requests:
- Sticks the load requests, individually, into a work queue served by 25-50 threads which are responsible for DB loads.
- Creates a mutex for the given object type + id, and locks it, to ensure that no two applications can perform the same request for the same object simultaneously.
- Performs a cache lookup.
  * This first checks a 10-minute "short-term" memory, which consists of all of the objects we've accessed within the last 10 minutes, to reduce the cost of deserialisation.
  * This then checks a set of standalone memcached servers for a long-term cache entry; cache lifetimes are determined by the loader for the specific type of object; objects pulled out of long-term cache are placed into short-term.
- Performs the db.load on the individual requested object ID; objects pulled out of the DB are placed into both long- and short-term cache.
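The steps above can be sketched roughly as follows. This is only an illustration, not the actual loader: the class name BulkLoader, the plain Map standing in for the memcached tier, and the Function standing in for db.load are all assumptions made up for the example. The lock here only works inside one JVM and is keyed by id alone, on the assumption of one loader per object type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical bulk loader: short-term memory, then long-term cache, then DB. */
class BulkLoader<T> {
    private static final long SHORT_TERM_MILLIS = 10 * 60 * 1000L;

    private static final class Entry<V> {
        final V value;
        final long storedAt;
        Entry(V value, long storedAt) { this.value = value; this.storedAt = storedAt; }
    }

    private final Map<Long, Entry<T>> shortTerm = new ConcurrentHashMap<>();
    private final Map<Long, T> longTerm;      // stand-in for the memcached tier
    private final Function<Long, T> dbLoad;   // stand-in for db.load(type, id)
    // One lock object per id; never evicted in this sketch.
    private final Map<Long, Object> locks = new ConcurrentHashMap<>();
    private final ExecutorService pool;

    BulkLoader(Map<Long, T> longTerm, Function<Long, T> dbLoad, int threads) {
        this.longTerm = longTerm;
        this.dbLoad = dbLoad;
        // Daemon threads so a forgotten shutdown() does not keep the JVM alive.
        this.pool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r, "bulk-loader");
            t.setDaemon(true);
            return t;
        });
    }

    /** Queues one load per id and returns the results in the order of ids. */
    List<T> loadAll(List<Long> ids) {
        List<Future<T>> futures = new ArrayList<>();
        for (Long id : ids) {
            Callable<T> task = () -> loadOne(id);
            futures.add(pool.submit(task));
        }
        List<T> out = new ArrayList<>();
        for (Future<T> f : futures) {
            try {
                out.add(f.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
        return out;
    }

    /** Loads a single object, holding the per-id lock for the whole lookup. */
    T loadOne(long id) {
        Object lock = locks.computeIfAbsent(id, k -> new Object());
        synchronized (lock) {
            long now = System.currentTimeMillis();
            Entry<T> recent = shortTerm.get(id);
            if (recent != null && now - recent.storedAt < SHORT_TERM_MILLIS) {
                return recent.value;                 // short-term hit
            }
            T value = longTerm.get(id);              // long-term lookup
            if (value == null) {
                value = dbLoad.apply(id);            // fall back to the database
                if (value != null) {
                    longTerm.put(id, value);
                }
            }
            if (value != null) {
                shortTerm.put(id, new Entry<>(value, now));
            }
            return value;
        }
    }

    void shutdown() { pool.shutdown(); }
}
```

In real use, dbLoad would presumably wrap a db.load call for the object type in question, and the ID list would come from the stored procedure.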
At some point, once Castor moves to a 'bulk load' scenario, the cache check will be parallelised separately from the bulk load request, which should improve load times for these objects more than parallelising the individual loads does.
Strategies depend on what you're trying to do; those who rely on synchronized caches, such as JCache, have no need for a long/short-term distinction, and it just gets in the way of synchronization.
However, you can successfully write stored procedures which return nothing more than the ID, and then load those IDs individually.
Does Castor have a distributed cache to use in a clustered environment?
There are a few in progress which will make it into a future release; but for the most part, I have to say, they're about to become awfully easy to write (implement Map, basically). The bigger question is which distributed cache backend you intend to use, and what its licensing is.
There are good commercial ones, a good one under the Apache license, a few under GPL/LGPL; it depends on your project. Licensing may prevent us from releasing a cache for every system under the sun.
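To give a feel for the "implement Map, basically" remark, here is a minimal sketch of a time-limited cache exposed as a java.util.Map. It assumes nothing about Castor's actual cache interface; the class name TimeLimitedCache and the injectable clock are made up for the example, and a distributed version would delegate the same operations to whatever backend you pick rather than to a local map.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical Map-backed cache whose entries expire after a fixed time. */
class TimeLimitedCache<K, V> extends AbstractMap<K, V> {
    private static final class Timed<T> {
        final T value;
        final long expiresAt;
        Timed(T value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final ConcurrentHashMap<K, Timed<V>> store = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis

    TimeLimitedCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    @Override
    public V put(K key, V value) {
        Timed<V> old = store.put(key, new Timed<>(value, clock.getAsLong() + ttlMillis));
        return liveValue(old);
    }

    @Override
    public V get(Object key) {
        Timed<V> t = store.get(key);
        if (t != null && t.expiresAt <= clock.getAsLong()) {
            store.remove(key, t);       // drop the stale entry lazily
            return null;
        }
        return t == null ? null : t.value;
    }

    @Override
    public V remove(Object key) { return liveValue(store.remove(key)); }

    @Override
    public boolean containsKey(Object key) { return get(key) != null; }

    @Override
    public void clear() { store.clear(); }

    /** Read-only snapshot of the entries that have not expired yet. */
    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        long now = clock.getAsLong();
        Set<Map.Entry<K, V>> live = new HashSet<>();
        for (Map.Entry<K, Timed<V>> e : store.entrySet()) {
            if (e.getValue().expiresAt > now) {
                live.add(new SimpleImmutableEntry<K, V>(e.getKey(), e.getValue().value));
            }
        }
        return Collections.unmodifiableSet(live);
    }

    private V liveValue(Timed<V> t) {
        return (t == null || t.expiresAt <= clock.getAsLong()) ? null : t.value;
    }
}
```

Note that this sketch is local to one JVM: two instances on two app servers would each hold their own copy and would not see each other's updates.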
As for whether or not we'll be writing our own cross-server multicast framework: IMO, we haven't the time, interest, or fascination for wheel reinvention, especially when other people are only too willing to help tackle the hard stuff in their own projects. :) Find a framework and use it; each has different strengths and weaknesses.
There will be a general performance improvement in the cards for the non-synchronized built-in implementations of caches at some point in the future, and those, too, may offer some important features to users.
Since the objects we configured in Castor's cache are time-limited (in our
case 15 mins), in a clustered environment different instances of Castor (in
different app servers) have their own cache. How will the cache in the second
instance be updated if the cache in the first instance is updated via the
application?
I think you already know the answer to that question. ;)
Cheers, Greg

