On Jul 25, 2007, at 00:51, Andreas L Delmelle wrote:
Without doing any measurements, one can't say for certain, but I'm guessing the risk of thread contention is reasonably high.
Already did a few measurements, albeit no real FOP-related tests; they focused purely on the concurrency behavior of PropertyCache itself. The results seem somewhat disconcerting, IIC.
The more threads you spawn, and run simultaneously, the higher the penalty per thread. For maximum effect, try spawning 100 threads on a multi-CPU machine, and compare the average duration of a single thread performing 100000-200000 fetches with performing the same task 100 times in sequence.
But even for as little as 10 threads, there already seems to be a severe performance penalty.
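To make the test setup concrete, here is a rough sketch of the kind of contention benchmark described above (class name, key range, and parameters are illustrative assumptions, not the actual tester class attached below): N threads hammering a synchronizedMap concurrently, then the same workload run in sequence, so the per-thread penalty under contention becomes visible.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative contention benchmark: every get()/put() on a synchronizedMap
// locks the whole map, so concurrent threads serialize on a single monitor.
public class ContentionBench {

    static final Map<Integer, Integer> cache =
            Collections.synchronizedMap(new HashMap<Integer, Integer>());

    static Runnable task(final int fetches) {
        return new Runnable() {
            public void run() {
                for (int i = 0; i < fetches; i++) {
                    Integer key = Integer.valueOf(i % 1000);
                    cache.put(key, key);   // acquires the map-wide lock
                    cache.get(key);        // acquires it again
                }
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 10;       // assumed thread count, per the discussion
        int fetches = 100000;   // assumed per-thread fetch count

        // Concurrent run: all threads contend on one lock.
        long t0 = System.currentTimeMillis();
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            (pool[i] = new Thread(task(fetches))).start();
        }
        for (Thread t : pool) {
            t.join();
        }
        long concurrent = System.currentTimeMillis() - t0;

        // Sequential run: same total work, no contention.
        t0 = System.currentTimeMillis();
        for (int i = 0; i < threads; i++) {
            task(fetches).run();
        }
        long sequential = System.currentTimeMillis() - t0;

        System.out.println("concurrent: " + concurrent
                + "ms, sequential: " + sequential + "ms");
    }
}
```

On a multi-CPU machine the gap between the two timings grows with the thread count, since the synchronized wrapper admits only one thread at a time.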
As to that number, I must admit that I have no real idea of realistic figures for testing. Maybe 100 threads is a bit over the top... OTOH, if you look at the original profile Jeremias posted last week, we do see that PropertyCache.fetch() would be called roughly 100,000 times for NumberProperty alone, for the same 10 instances, for such a document. Very right to be concerned.
So, I got to thinking, and for that matter, more than only thinking... <snip />
That said, Java 1.5 added a neat util.concurrent package with, besides the famous Semaphore, for example, a ConcurrentHashMap implementation. It's supposed to perform far better under heavy contention, since it never locks the entire underlying table. Tricky implementation to simulate under 1.4 though, AFAIU. :-(
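For illustration, here's the kind of lock-free-reader interning pattern ConcurrentHashMap enables (class and method names are my own, just to show the idiom): putIfAbsent() canonicalizes an instance atomically without ever locking the whole table.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: canonicalizing instances through
// ConcurrentHashMap.putIfAbsent(), which locks at most one internal
// segment rather than the entire table.
public class InternDemo {

    private static final ConcurrentHashMap<Integer, Integer> cache =
            new ConcurrentHashMap<Integer, Integer>();

    // Returns the canonical instance equal to 'value', caching it if absent.
    static Integer fetch(Integer value) {
        Integer prev = cache.putIfAbsent(value, value);
        return (prev != null) ? prev : value;
    }

    public static void main(String[] args) {
        Integer a = fetch(new Integer(1000));
        Integer b = fetch(new Integer(1000));
        // Distinct but equal inputs resolve to one shared instance.
        System.out.println(a == b);
    }
}
```

Note that ConcurrentHashMap keeps strong references to its keys, which is exactly why it can't directly replace a WeakHashMap-style cache.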
... and here's the catch: Tricky, but certainly not impossible, so I found out. Cost me a bit of my time, but I hijacked some code from ConcurrentHashMap and HashMap to rewrite the PropertyCache as a 'WeakConcurrentHashMap' — only, it's not really a map. More a hashtable dedicated to storing only Property instances as keys. No need to store them also as (weak referents of) values, indeed!
Instead of making it a sort of proxy to a synchronizedMap, I went all the way down to the roots, gave the PropertyCache its own table, implemented get() and put() methods but made those only accessible to the fetch() method, etc.
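To sketch what that fetch()-only access pattern looks like (a deliberately simplified assumption of the design — single table, no segments or collision chains, names not taken from the actual attachment): get() and put() are private, so the only way in is the canonicalizing fetch() method, and the table holds its keys through WeakReferences so unused Property instances can be collected.

```java
import java.lang.ref.WeakReference;

// Hypothetical sketch: a weak-keyed canonicalizing cache whose internal
// get()/put() are reachable only through fetch(). Collisions simply
// overwrite, which is acceptable for a cache (worst case: a re-insert).
public class PropertyCacheSketch<P> {

    private static final int INITIAL_CAPACITY = 64;

    private final WeakReference<P>[] table;

    @SuppressWarnings("unchecked")
    public PropertyCacheSketch() {
        table = new WeakReference[INITIAL_CAPACITY];
    }

    // Private: look up an instance equal to 'key', or null.
    private P get(P key) {
        WeakReference<P> ref = table[indexFor(key)];
        if (ref != null) {
            P cached = ref.get();   // may be null if already collected
            if (key.equals(cached)) {
                return cached;
            }
        }
        return null;
    }

    // Private: store the instance as a weakly referenced key.
    private void put(P key) {
        table[indexFor(key)] = new WeakReference<P>(key);
    }

    private int indexFor(P key) {
        return (key.hashCode() & 0x7fffffff) % table.length;
    }

    // The single public entry point: returns the canonical instance,
    // caching the argument if no equal instance is present.
    public synchronized P fetch(P instance) {
        P cached = get(instance);
        if (cached != null) {
            return cached;
        }
        put(instance);
        return instance;
    }
}
```

The real implementation replaces the coarse synchronized fetch() with per-segment locking, which is where the concurrency win comes from.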
Most tricky were in fact the cleaner thread and the rehash, which are implemented as follows: if any put() operation pushes the number of entries in a segment above double the amount of segments, it spawns a cleaner thread that first tries to clean up the WeakReferences in that segment. Obviously, since many of these cleaner threads can be spawned simultaneously, each one does a re-check under synch first to see if another has already carried the load... if so, the thread immediately dies. If the cleanup has no effect, rehash() is called, which blocks the entire cache. It first checks whether more than half of the segments exceed this threshold. If that is the case, the amount of segments is doubled and the entries are redistributed. The larger the cache grows, the less likely it becomes that it is ever completely blocked, and at the same time, the cleaner threads always make sure that obsolete references are removed before increasing the size, so the cache should never grow unnecessarily large.
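The growth policy just described can be condensed into a small decision sketch (thresholds taken from the description above; class and method names are mine, not the attachment's): a segment overflowing past 2 × segmentCount triggers cleanup first, and a full doubling rehash only happens once more than half the segments are over the threshold.

```java
// Illustrative model of the described growth policy, tracking only
// per-segment entry counts; the real cache also holds the entries.
public class GrowthPolicySketch {

    private int[] segmentEntries;   // live-entry count per segment

    public GrowthPolicySketch(int segments) {
        segmentEntries = new int[segments];
    }

    // Threshold from the description: twice the number of segments.
    int threshold() {
        return 2 * segmentEntries.length;
    }

    void record(int segment, int count) {
        segmentEntries[segment] = count;
    }

    // A put() that pushes its segment over threshold() would first run the
    // cleaner on that segment; rehashing is warranted only when more than
    // half of all segments still exceed the threshold afterwards.
    boolean needsRehash() {
        int over = 0;
        for (int n : segmentEntries) {
            if (n > threshold()) {
                over++;
            }
        }
        return over > segmentEntries.length / 2;
    }

    // rehash(): double the segment count and redistribute (elided here).
    void doubleSegments() {
        segmentEntries = new int[2 * segmentEntries.length];
    }
}
```

With 4 segments the threshold is 8 entries per segment, so a rehash needs at least 3 overflowing segments — one hot segment alone only triggers cleanup, never a table-wide block.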
Almost didn't believe it myself at first, but if you compare the timings... I almost suspect that I overlooked something, but everything I tested so far seems to be working. Definitely works for FOP.
Yes, it's an increased overhead in maintenance to 'roll your own', but still, if you look at the speed difference, that may be convincing enough.
No more words! The code: see attachments for the revised PropertyCache + a tester class. Try it out, compare with the minimal PropertyCache in current FOP trunk, and let me know if there's any objection to committing it before the release (even if only as a temporary measure, so we can at least offer the reduced footprint without too much of a penalty in concurrent runs).