Just in general...

(using gutted code, and a slightly simplified API, as using my own code from our app is frankly more complicated to explain.)

public XMLObjectIF unmarshallContent(Class outputType, File file, URL mappingURL, int cacheTimeHours) throws IOException
{
        CachedMappingFactory mf = CachedMappingFactory.getInstance();
        CachedMappingIF mapping = mf.get(cfg, mappingURL);
        long lastModified = file.lastModified();

        XMLObjectIF result = null;

        CacheManagerIF cm = (CacheManagerIF) ServiceRegistry.getService(CacheManagerIF.class);
        String cacheKey = helper.getCacheKey(debugKey, file, mapping);

        // Reuse the cached object only if the file has not changed since we cached it.
        CachedSerialization cs = (CachedSerialization) cm.get(cacheKey);
        if (cs != null && cs.lastModified == lastModified) {
                result = cs.objectForm;
        }

        if (result == null) {
                // perform the unmarshalling.
                [...]

                if (result != null && cacheable) {
                        CachedSerialization entry = new CachedSerialization();
                        entry.objectForm = result;
                        entry.lastModified = lastModified;
                        cm.set(cacheKey, entry, cacheTimeHours);
                }
        }

        return result;
}



That.

Now, what this does, is:

1) The XMLObjectIF return type ensures that all objects serialized and deserialized in the framework implement our basic XMLObjectIF interface, as every object that passes through our system must. This guarantees, for example, that all objects have the fields the caching frameworks need, as well as everything necessary for Castor.

2) We pass in the class we're expecting to load and, in this particular instance, the file, the mapping URL, and the number of hours we want this representation to be considered cacheable. This particular cache is a cache of the *instance* of this XML, not the mapping.

3) Grab our singletons. CachedMappingFactory is our mapping cache; it basically holds a map of URL.toExternalForm() -> object instance, and uses timestamps to figure out when those mappings were last modified. If the timestamp on the mapping file changes, it recreates the mapping object and stores it again. Straightforward and simple.

4) Try and grab a cached representation of this file. We don't cache the bare object - we cache, in this instance, a CachedSerialization: it holds the object along with the long timestamp of the original file we generated it from. The cache key, in turn, identifies the mapping file and *its* timestamp. A cached representation is valid in exactly one circumstance: it was loaded with the *exact same* mapping URL and timestamp (both used during key generation), and the last modified timestamp on the file I'm trying to unmarshal is exactly the same as the timestamp stored with the cached object. I.e.: the mapping is identical to what it was last time, and so is the file's timestamp.

Sidenote: the mapping file is a part of the key, and not a part of the CachedSerialization object. It is perfectly valid and legal to take an object and use a variety of different mappings to produce content. That requires the mapping information used to create the serialization to be a part of the key, not a part of the instance; otherwise a switch of mappings to generate content would obliterate perfectly good cached content.

5) Call the unmarshal statement with our cached mapping.
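
To make point 4 and the sidenote concrete, here is a minimal sketch of the idea, not our actual code. The names SerializationCache, CachedEntry, cacheKey and getIfFresh are made up for illustration, and a plain HashMap stands in for our real cache manager. The mapping's identity and timestamp go into the key; the entry only remembers the source file's timestamp.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for CachedSerialization: the object plus the source file's timestamp. */
final class CachedEntry {
    final Object objectForm;
    final long lastModified;

    CachedEntry(Object objectForm, long lastModified) {
        this.objectForm = objectForm;
        this.lastModified = lastModified;
    }
}

/** Hypothetical in-memory cache; a real system would sit on top of its own cache manager. */
final class SerializationCache {
    private final Map<String, CachedEntry> entries = new HashMap<>();

    /** The mapping's URL and timestamp live in the key, not in the entry. */
    static String cacheKey(String filePath, String mappingUrl, long mappingLastModified) {
        return filePath + '|' + mappingUrl + '|' + mappingLastModified;
    }

    void put(String key, Object value, long fileLastModified) {
        entries.put(key, new CachedEntry(value, fileLastModified));
    }

    /** Returns the cached object only if the source file's timestamp still matches. */
    Object getIfFresh(String key, long fileLastModified) {
        CachedEntry entry = entries.get(key);
        if (entry != null && entry.lastModified == fileLastModified) {
            return entry.objectForm;
        }
        return null;
    }
}
```

Because the mapping is in the key, loading the same file through a second mapping simply creates a second entry next to the first, rather than overwriting it.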


Now, a number of things have been left as an exercise to the reader:

1) Whether or not caching every XML serialization is "a good idea". We use a dual-cache system, based on a short-term in-memory cache and a long-term memcached, and our granularity is in hours. Some people might not want to cache. Moreover, I've stripped out all of the brains around deciding whether something is cacheable, as it involves explaining how these things are all wired together in our system, and there's just no point.

2) Whether or not you need a dedicated object for mapping caching, with a factory for loading. Quite frankly, doing it well means loading from a variety of different sources - two, at minimum: mapping by URL (getResource() from WAR/JAR) and mapping by File (absolute paths in the filesystem). At some point, we're even going to have to add in mapping via the Slide API, as some of our content is moving into a WebDAV store. It's relatively straightforward, and confines all of the logic for handling mapping loading to a single place that you can then write simple unit tests for. Fundamentally, just load a mapping, and keep it around. If you don't care about checking for modifications while the application is still running, and feel like restarting your JVM every time you want to test, feel free to just stick it in a map, ignore all the long timestamp matches, and go nuts.

3) How to load a mapping; how to perform marshalling and unmarshalling. If you're reading this, I hope you're already aware. :)
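
For point 2, "load a mapping, and keep it around" with a timestamp check can be sketched roughly like this. This is an illustration under assumptions, not our CachedMappingFactory: ReloadingCache is a made-up name, and the loader and timestamp lookup are passed in as functions so the same class could sit in front of URL, File or any other source. In real use the loader would be whatever builds your Castor mapping object.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.ToLongFunction;

/** Hypothetical cache that reloads a value when the timestamp of its source changes. */
final class ReloadingCache<V> {

    /** A loaded value together with the source timestamp it was loaded at. */
    private static final class Holder<T> {
        final T value;
        final long lastModified;

        Holder(T value, long lastModified) {
            this.value = value;
            this.lastModified = lastModified;
        }
    }

    private final Map<String, Holder<V>> holders = new ConcurrentHashMap<>();
    private final Function<String, V> loader;           // e.g. builds a mapping from a location
    private final ToLongFunction<String> timestampOf;   // e.g. last-modified time of that location

    ReloadingCache(Function<String, V> loader, ToLongFunction<String> timestampOf) {
        this.loader = loader;
        this.timestampOf = timestampOf;
    }

    /** Returns the held value, reloading it first if the source timestamp has changed. */
    V get(String location) {
        long current = timestampOf.applyAsLong(location);
        Holder<V> held = holders.get(location);
        if (held == null || held.lastModified != current) {
            // Two threads may both reload here; that costs time but is harmless for this sketch.
            held = new Holder<>(loader.apply(location), current);
            holders.put(location, held);
        }
        return held.value;
    }
}
```

If you don't care about picking up changes at runtime, pass a timestamp function that always returns the same value and you get the "stick it in a map" version.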


Now, one of the things we don't do internally - yet - is provide any way of handing Castor a factory for SAX objects. SAX objects are expensive to create, and have a reuse API; but the object pooling just isn't there right now. At some point, either Castor will gain object pools of its own, or you'll get an API for registering a parser factory with your current run, allowing you to share your current SAX parser pool with the rest of your system.

If you've cached the mapping, and keep that Mapping object around, then every time you want to parse, it's just a matter of firing up a SAX parser and doing the dirty work. Once the SAX parser instances are reused, it should get even better - less hunting around in jars and classloading is always a good thing.
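
For what a parser pool in your own system might look like, here is a minimal sketch using only the standard JAXP classes. It is not a Castor API and Castor will not pick it up on its own; SaxParserPool, borrow and release are made-up names. It relies on SAXParser.reset(), which the JDK's bundled parser supports but which other implementations may not.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

/** Hypothetical pool that hands out SAX parsers and resets them on return. */
final class SaxParserPool {
    private final SAXParserFactory factory = SAXParserFactory.newInstance();
    private final Deque<SAXParser> idle = new ArrayDeque<>();

    SaxParserPool() {
        factory.setNamespaceAware(true);
    }

    /** Returns an idle parser if there is one, otherwise creates a new one. */
    synchronized SAXParser borrow() {
        SAXParser parser = idle.pollFirst();
        if (parser != null) {
            return parser;
        }
        try {
            return factory.newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create a SAX parser", e);
        }
    }

    /** Resets the parser to its original configuration and keeps it for the next caller. */
    synchronized void release(SAXParser parser) {
        parser.reset();
        idle.addFirst(parser);
    }
}
```

Call borrow() before parsing and release() in a finally block afterwards, so a parser that hit an error still goes back to the pool in a clean state.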

But far and away, one of the most expensive things you can do in the system is create mappings. We take the file, load it, look at the classes you're trying to map, dig out, through introspection, all of the methods that match all of our mappings, and build the information structures that allow us to get and set values on your classes when we encounter data in the XML, or go to write it.

As it says in the docs - somewhere - caching that mapping file is important if you're going to expect high-performance, reusable behavior. JDO does this as a side-effect of its operation - loading the configuration loads the mappings and holds them, for its lifetime, in the JDO object. On the XML side, you're not given that kind of magical automata, and it's up to you to do the right thing. (And on the JDO side, that magical automata doesn't always work in our favor; it raises lifecycle questions.)


Maintain good object lifecycles for the data you depend on, and watch your apps fly.


On 2 Aug 2005, at 12:17, Mesut Celik wrote:

great news Gregory,

as you stated below, i dont use any static method of castor.

can you guide me through how i can cache the mapping file? and if you
have any other tips which is supposed to be very valuable, it'd be
highly appreciated both for me and for community as well.

