Re: [PROPOSAL] Cache project...

James Strachan Tue, 19 Jun 2001 06:20:02 -0700


Hi Rod

I've started to use your cache implementation in earnest lately; it's a good piece of work, thanks. I've also been meaning to evaluate Aaron's work and see how we can all combine our ideas.

I have a few comments on your API.

* Is there much code that depends on this API yet? Can we tweak the API somewhat without burdening people with excessive maintenance?

* I'd prefer a few method names to more closely reflect the Java Collections Framework naming, for consistency:

get() rather than retrieve()

put() rather than store()

remove(key) rather than clear(key)

* a size() method to return the number of items in the cache would be useful

* some mechanism for iterating through the current cache contents, such as an iterator over the keys, would be useful, e.g.

Iterator keyIterator();

* What do you think about relaxing the prerequisite that all keys and values in the cache must be Serializable? I understand that for RMI and writing to disk this makes sense, but I often want to cache transient objects (or transient transformations of persistent state). (The RMI or disk serializers could always ignore cache content that is not Serializable.) Often we want to cache things to avoid remote communication or reading from disk.

* I'd like a method like the put() method from Map

Object put( Object key, Object value )

which is similar to the current method:-

boolean store( Serializable key, Serializable value, Long expiry, Long cost )

but

(i) the previous value is returned rather than a true / false indicator of whether this new value was stored. This makes it easy to do change listeners and suchlike. I can't see too many application developers caring much whether store() actually stores the object or not. Do you have a specific need for this boolean? Returning the previous value would also mirror the Map interface in the Java Collections Framework.

(ii) the values of the expiry and cost arguments come from the cache itself rather than from the application programmer. Defaults could be applied to a particular cache (region) via a config file or similar. Often an application developer doesn't know what these values should be.

* Similarly I'd like the remove() method to return the previous value that was in the cache or null if there is nothing cached for the key, to be similar to the Map.remove(key) method.

public Object remove( Object key );
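Taken together, the suggestions above could be sketched roughly as follows. This is only an illustration, not Rod's API: the names SimpleCache and MapBackedCache are hypothetical, and expiry, cost and eviction are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical interface name; the thread does not propose one.
interface SimpleCache {
    Object get(Object key);                 // rather than retrieve()
    Object put(Object key, Object value);   // returns the previous value, or null
    Object remove(Object key);              // returns the previous value, or null
    int size();                             // number of items currently cached
    Iterator<Object> keyIterator();         // iterate over the current keys
}

// Assumed in-memory implementation: a plain HashMap with no expiry or cost.
class MapBackedCache implements SimpleCache {
    private final Map<Object, Object> entries = new HashMap<>();

    public synchronized Object get(Object key) {
        return entries.get(key);
    }

    public synchronized Object put(Object key, Object value) {
        return entries.put(key, value);
    }

    public synchronized Object remove(Object key) {
        return entries.remove(key);
    }

    public synchronized int size() {
        return entries.size();
    }

    public synchronized Iterator<Object> keyIterator() {
        // Iterate over a snapshot so callers are not affected by later changes.
        return new ArrayList<>(entries.keySet()).iterator();
    }
}
```

Because put() and remove() hand back the previous value, a change listener can compare old and new values without a separate lookup.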

Thoughts?

James

----- Original Message -----

From: Waldhoff, Rodney

To: [EMAIL PROTECTED]

Cc: 'Daniel Hoppe '

Sent: Friday, May 18, 2001 2:05 PM

Subject: RE: [PROPOSAL] Cache project...

For what it's worth, I too have a home-grown cache implementation that we've been pretty happy with. We use this cache in a number of places, from caching JSP output (using a set of <cache/> tags), to caching database results, and like Daniel suggests, computationally expensive results.

I had actually been planning to propose a cache project as well, so I think I'm in favor, but I have some design suggestions.

Like Craig suggests, the interface is essentially the same as ObjectPool or HashMap--i.e., put an object into the cache, get an object from the cache--but of course the cache can return multiple copies of the object, and the put operation often includes extra attributes such as time-to-live or cost. Our basic cache interface looks like:

boolean store(Serializable key, Serializable val, Long expiry, Long cost)
          Store the specified val under the specified key.
Serializable retrieve(Serializable key)
          Obtain the value previously stored under the given key.
void clear()
          Remove all values previously stored.
void clear(Serializable key)
          Remove any value previously stored under the given key.
boolean contains(Serializable key)
          Returns true if I have a value associated with the given key, false otherwise.

The cache can publish store/retrieve/evict events to listeners:

void registerRetrievalListener(RetrievalListener obs)
          Add the given RetrievalListener to my set of RetrievalListeners.
void registerStorageListener(StorageListener obs)
          Add the given StorageListener to my set of StorageListeners.
void unregisterRetrievalListener(RetrievalListener obs)
          Remove the given RetrievalListener from my set of RetrievalListeners.
void unregisterRetrievalListeners()
          Clear my set of RetrievalListeners.
void unregisterStorageListener(StorageListener obs)
          Remove the given StorageListener from my set of StorageListeners.
void unregisterStorageListeners()
          Clear my set of StorageListeners.

The cache also contains the notion of a "group"--a sort of meta-key that can be associated with more than one object in the cache:

boolean store(Serializable key, Serializable val, Long expiry, Long cost, Serializable group)
          Store the specified val under the specified key and the specified group.
Serializable[] getKeysForGroup(Serializable group)
          Obtain the keys of all values stored under the given group.
void clearGroup(Serializable group)
          Remove any value previously stored under the given group.
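One way such a group could be tracked is with a reverse index from group to keys. The sketch below is an assumption about how this might work, not the implementation described in this message; the class name GroupedCache is hypothetical, and expiry and cost are omitted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical class; illustrates the "group" meta-key only.
class GroupedCache {
    private final Map<Object, Object> values = new HashMap<>();
    private final Map<Object, Object> groupOfKey = new HashMap<>();
    private final Map<Object, Set<Object>> keysOfGroup = new HashMap<>();

    // Store val under key and associate the key with the given group.
    public void store(Object key, Object val, Object group) {
        Objects.requireNonNull(group, "group");
        clear(key); // drop any earlier group membership for this key
        values.put(key, val);
        groupOfKey.put(key, group);
        keysOfGroup.computeIfAbsent(group, g -> new HashSet<>()).add(key);
    }

    public Object retrieve(Object key) {
        return values.get(key);
    }

    // Remove a single key and its group membership.
    public void clear(Object key) {
        values.remove(key);
        Object group = groupOfKey.remove(key);
        if (group != null) {
            Set<Object> keys = keysOfGroup.get(group);
            keys.remove(key);
            if (keys.isEmpty()) {
                keysOfGroup.remove(group);
            }
        }
    }

    public Object[] getKeysForGroup(Object group) {
        Set<Object> keys = keysOfGroup.get(group);
        return keys == null ? new Object[0] : keys.toArray();
    }

    // Remove every value stored under the given group.
    public void clearGroup(Object group) {
        Set<Object> keys = keysOfGroup.remove(group);
        if (keys == null) {
            return;
        }
        for (Object key : keys) {
            values.remove(key);
            groupOfKey.remove(key);
        }
    }
}
```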

The main thing that we do right in this cache impl., I think, is that we treat the Cache as an aggregation of 'policy' objects, making it easy to create different types of caches and/or different cache configurations. Specifically, the Cache is an aggregation of:

* A "Stash" which is a physical storage mechanism for cached objects (memory, disk, database, etc.)

* An (optional) "StashPolicy" which determines whether or not a given object is cacheable

* An (optional) "EvictionPolicy" which determines which objects to evict (remove from the cache) when the cache is full

(Least Recently Used, Least Relative Value, Least Frequently Used, etc.)

* A "StaleObjectEvictor" which removes stale (expired) objects from the cache.
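A rough sketch of that aggregation might look like the following. Only the role names Stash, StashPolicy and EvictionPolicy come from the list above; every method signature, the MemoryStash and FifoEviction classes, and the PolicyCache class itself are assumptions made for illustration. The StaleObjectEvictor is left out to keep the example short.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// Physical storage for cached objects (signatures are assumed).
interface Stash {
    Object get(Object key);
    void put(Object key, Object val);
    void remove(Object key);
    int size();
}

// Decides whether a given object is cacheable at all.
interface StashPolicy {
    boolean isCacheable(Object key, Object val);
}

// Decides which key to evict when the cache is full.
interface EvictionPolicy {
    void touched(Object key);
    void removed(Object key);
    Object nextVictim(); // null when there is nothing to evict
}

class MemoryStash implements Stash {
    private final Map<Object, Object> map = new HashMap<>();
    public Object get(Object key) { return map.get(key); }
    public void put(Object key, Object val) { map.put(key, val); }
    public void remove(Object key) { map.remove(key); }
    public int size() { return map.size(); }
}

// Evicts the key that was stored first; later touches do not reorder it.
class FifoEviction implements EvictionPolicy {
    private final LinkedHashSet<Object> order = new LinkedHashSet<>();
    public void touched(Object key) { order.add(key); }
    public void removed(Object key) { order.remove(key); }
    public Object nextVictim() { return order.isEmpty() ? null : order.iterator().next(); }
}

class PolicyCache {
    private final Stash stash;
    private final StashPolicy stashPolicy;   // optional, may be null
    private final EvictionPolicy eviction;   // optional, may be null
    private final int capacity;

    PolicyCache(Stash stash, StashPolicy stashPolicy, EvictionPolicy eviction, int capacity) {
        this.stash = stash;
        this.stashPolicy = stashPolicy;
        this.eviction = eviction;
        this.capacity = capacity;
    }

    // Returns false when the policies refuse to store the value.
    boolean store(Object key, Object val) {
        if (stashPolicy != null && !stashPolicy.isCacheable(key, val)) {
            return false;
        }
        boolean isNew = stash.get(key) == null;
        while (isNew && stash.size() >= capacity) {
            Object victim = eviction == null ? null : eviction.nextVictim();
            if (victim == null) {
                return false; // full and nothing can be evicted
            }
            stash.remove(victim);
            eviction.removed(victim);
        }
        stash.put(key, val);
        if (eviction != null) {
            eviction.touched(key);
        }
        return true;
    }

    Object retrieve(Object key) {
        Object val = stash.get(key);
        if (val != null && eviction != null) {
            eviction.touched(key);
        }
        return val;
    }
}
```

Swapping MemoryStash for a disk-backed Stash, or FifoEviction for a least-recently-used policy, would change the cache's behaviour without touching PolicyCache.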

I definitely think that that is the right general approach.

> Altold, do you think we are trying to achieve something
> similar or is my approach too much heavy-wheight for
> your needs?

Daniel, I think we're talking about something similar, don't you?

-----Original Message-----
From: Daniel Hoppe
To: '[EMAIL PROTECTED]'
Cc: Felix Schauerte; Stefan Siprell
Sent: 5/17/01 4:10 PM
Subject: AW: [PROPOSAL] Cache project...

James, Craig,

I did not follow the ObjectPool discussions too closely, so I hope I'm
not missing the point. As far as I understand it

- the pool hands out an object exactly once. The object is unavailable
  to others until it is returned
- the cache may hand out an object several times. The object is not
  used exclusively.

The cache could therefore be used for e.g. complex computation results,
results of database queries and other computationally intensive tasks.
The object pool contains scarce resources, typically database
connections.

James, what do you think of making a cache JMX compliant? I'm working on
a cache which is supposed to buffer data in a content management system.
The cache is supposed to store objects which are quite expensive, so the
caching will be crucial for application performance. That's why I need
to have a good overview of what's happening inside the cache, not just
as some debugging output but rather in a fashion that can be monitored
remotely and integrated with monitoring tools.

I plan to implement the measures
- objects in cache,
- cache hits,
- invalidations,
- cache misses for keys that have never been in the cache,
- cache misses due to a constraint of maximum cache entries.

The last two points might seem a little strange at first, but I think
they can make sense in certain situations. If a cache has the option of
either using soft references or a maximum number of entries, there will
be a certain number of cache misses due to either a limited heap size
(in the case of soft references) or a limited maximum number of entries
(which is most probably related to heap size as well).

With this monitoring information a sysadmin could easily determine
whether e.g. an installation would benefit from an increased heap size.

To distinguish between both types of cache misses it is of course
necessary
- to keep a list of keys which are already known to the cache
- to have a mechanism to finally drop keys after a certain while (e.g.
  the key of a deleted page in a content management system should not
  remain in the cache for weeks and months).
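A minimal sketch of this bookkeeping, assuming a cache bounded by a maximum number of entries that drops its oldest entry when full. The class and counter names are hypothetical, and dropping known keys is reduced to an explicit invalidate() call instead of a time-based mechanism.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical class; counts the measures listed above.
class MonitoredCache {
    private final int maxEntries;
    // Insertion-ordered, so the first key is the oldest entry.
    private final Map<Object, Object> entries = new LinkedHashMap<>();
    // Keys the cache has seen, including ones whose values were evicted.
    private final Set<Object> knownKeys = new HashSet<>();

    long hits;
    long invalidations;
    long coldMisses;      // key has never been in the cache
    long capacityMisses;  // key was cached once but its value was evicted

    MonitoredCache(int maxEntries) {
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be at least 1");
        }
        this.maxEntries = maxEntries;
    }

    synchronized Object get(Object key) {
        Object value = entries.get(key);
        if (value != null) {
            hits++;
        } else if (knownKeys.contains(key)) {
            capacityMisses++;
        } else {
            coldMisses++;
        }
        return value;
    }

    synchronized void put(Object key, Object value) {
        if (!entries.containsKey(key) && entries.size() >= maxEntries) {
            // Evict the oldest entry but keep its key in knownKeys.
            Iterator<Object> oldest = entries.keySet().iterator();
            oldest.next();
            oldest.remove();
        }
        entries.put(key, value);
        knownKeys.add(key);
    }

    // Forget both the value and the key, e.g. when a page is deleted.
    synchronized void invalidate(Object key) {
        if (entries.remove(key) != null) {
            invalidations++;
        }
        knownKeys.remove(key);
    }

    synchronized int objectsInCache() {
        return entries.size();
    }
}
```

A steadily rising capacityMisses count relative to coldMisses would be the signal that a larger cache (or heap) might help.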

This implies that the value object needs to consume noticeably more heap
and computation time on creation than the key, otherwise the cache would
of course not make much sense.

In my current prototype I'm supplying three types of caches: a cache
which, when full,

- drops the oldest value objects,
- drops the ones with the longest interval since the last hit,
- drops the ones with the fewest total hits.

I can configure which kind of references are used (strong, soft, weak).
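The first two strategies can be approximated with java.util.LinkedHashMap, which supports both insertion order and access order. This is a sketch under that assumption and not the prototype described here: BoundedCache is a hypothetical name, the fewest-hits variant would need a per-entry hit counter, and soft or weak references are not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class: a size-bounded map that evicts its eldest entry.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    // evictLeastRecentlyUsed = false: drop the oldest entry (insertion order).
    // evictLeastRecentlyUsed = true:  drop the entry with the longest time
    //                                 since its last access (access order).
    BoundedCache(int maxEntries, boolean evictLeastRecentlyUsed) {
        super(16, 0.75f, evictLeastRecentlyUsed);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion.
        return size() > maxEntries;
    }
}
```

With a limit of two entries, storing "a" and "b", reading "a", then storing "c" evicts "b" in access-order mode and "a" in insertion-order mode.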

I like the idea of a cacheloader. I thought of that as well, but somehow
did not have the drive to implement it in my EJB environment yet (it
might be messy if the cacheloader has to deal with FinderExceptions of
EJBs).

What I did not fully get so far is the idea of the cache regions. I
always thought of putting a cache instance in some central location
(e.g. Web Application Context, JNDI tree), but maybe my view is too
J2EE focused on this. I'm a little bit sceptical about a cache being a
static member as there are some restrictions on that in the EJB spec.

All told, do you think we are trying to achieve something similar or is
my approach too heavyweight for your needs?

Cheers,

Daniel
