Re: More about the effects of the PropertyCache
On Aug 9, 2007, at 19:00, Andreas L Delmelle wrote: On Aug 8, 2007, at 08:30, Manuel Mall wrote: Not quite my point. When you put your private implementation of an int HashMap forward there were concerns raised about this as hard to maintain and not really required. Now we have a custom implementation of a concurrent hash map. Same concerns apply IMO. Concerns as Simple and Stupid as the cache IMO. Before anyone (Manuel or Vincent, who opposed the idea of the IntMap in the first place) feels offended, let me clarify: Why did you think I posted a proposal first, this time? I do understand the concerns, and I agreed for the IntMap, since ultimately it was only used for a very small portion of FOP. The PropertyCache on the other hand, is a relatively crucial component as it heavily used (indirectly by the PropertyParser, for instance). As such, IMO, the added maintenance here is peanuts compared to the speed and memory benefits --provided that there is a heavy, noticeable performance penalty in real-time, and not only in the simulations I did. That said, I did not read a veto anywhere... :o) Cheers Andreas
Re: More about the effects of the PropertyCache
On Aug 8, 2007, at 08:30, Manuel Mall wrote: Not quite my point. When you put your private implementation of an int HashMap forward there were concerns raised about this as hard to maintain and not really required. Now we have a custom implementation of a concurrent hash map. Same concerns apply IMO. Concerns as Simple and Stupid as the cache IMO. Cheers Andreas
Re: More about the effects of the PropertyCache
On Wednesday 08 August 2007 04:02, Andreas L Delmelle wrote: > On Aug 4, 2007, at 13:22, Vincent Hennebert wrote: > > Manuel Mall a écrit : > > > > > >> I have been following this discussion with very little attempt to > >> understand the intricate technical details of concurrent maps etc, > >> but > >> I am wondering why we don't apply the KISS principle here? > > Oh, but it is Simple and Stupid. :-) > Much simpler and much more stupid than our layoutengine or hyphenator > or property resolution code Not quite my point. When you put your private implementation of an int HashMap forward there were concerns raised about this as hard to maintain and not really required. Now we have a custom implementation of a concurrent hash map. Same concerns apply IMO. > > Cheers > > Andreas Manuel
Re: More about the effects of the PropertyCache
On Aug 4, 2007, at 13:22, Vincent Hennebert wrote: Manuel Mall a écrit : I have been following this discussion with very little attempt to understand the intricate technical details of concurrent maps etc, but I am wondering why we don't apply the KISS principle here? Oh, but it is Simple and Stupid. :-) Much simpler and much more stupid than our layoutengine or hyphenator or property resolution code or... The initial version I posted earlier was in fact so stupid it still contained a huge design flaw. Lucky for me, nobody actually took a look at the code... Either that, or I should assume that a whole lot of people looked at it, but no one understood it. But that doesn't make it 'not simple'! IIRC the original problem was that the FOP memory footprint for rendering large documents was causing issues. One set of culprits that were identified were the properties. Given that a FOP rendering run is single threaded, i.e. there are no threads created within FOP, why don't we start with a property cache per run? We have already started (a couple of months ago)... with static final caches. An idea that seems to do the job nicely, although there are some reservations on the performance penalty because the cache needs to be synchronized in that case. OTOH, Richard also raised the question, and to be honest, I don't have a clue for the moment: What are realistic numbers to measure that penalty with? I tried spawning 10 threads, and the penalty was already significant, but... This is supposing that all those 10 threads would need to access the /same/ cache at the exact /same/ instant. If each thread spends less than 10% of the time parsing properties, the chances of that happening become very little. No threading issues, no performance issues, and large gains in memory footprint reduction. Indeed, but I think that implementing that would be much more difficult than the current solution. Much less adhering to the beloved KISS-principle than a Simple and Stupid home-made Hashtable... :-) That will even benefit the memory footprint of concurrent FOP runs. Admittedly not to the extend as globally shared cache would do, but it would be much simpler and we can use the standard Java collection classes to implement it. I’m afraid I must agree with Manuel here; although I’m not very familiar with that whole area and I may well be also missing something. It seems to me that the main problem of FOP is that it isn’t able to render big documents, and that properties only play a part in that problem. It might be more useful to try and optimize the whole rendering process, from which everyone will benefit, those running FOP on a server as well as all others. That’s not the same kind of effort but it’s as much important IMHO. A cache per rendering run would do the thing, wouldn’t it? Coupled with a flyweight factory for those properties with a small number of possible values, which themselves could be shared among the different threads. Yes, but see my comments above: to implement this, we would be introducing yet another set of classes, making the property resolution code even more complicated than it already is. It only / seems/ simpler... Also, maybe it’s worth keeping in mind that while that’s not currently the case, we want to eventually make the rendering process multi-threaded. Although the two issues might actually not interfere. Well, if anything, the big PRO for a concurrent, thread-safe cache should precisely be the prospect of FOP multithreading *internally*. In that case, the current solution --possibly over time again backed by a standard 1.5 ConcurrentHashMap-- at least offers something which a rendering-run-local cache would lack. Cheers Andreas
Re: More about the effects of the PropertyCache
Hi, Manuel Mall a écrit : > I have been following this discussion with very little attempt to > understand the intricate technical details of concurrent maps etc, but > I am wondering why we don't apply the KISS principle here? > > IIRC the original problem was that the FOP memory footprint for > rendering large documents was causing issues. One set of culprits that > were identified were the properties. Given that a FOP rendering run is > single threaded, i.e. there are no threads created within FOP, why > don't we start with a property cache per run? No threading issues, no > performance issues, and large gains in memory footprint reduction. > > That will even benefit the memory footprint of concurrent FOP runs. > Admittedly not to the extend as globally shared cache would do, but it > would be much simpler and we can use the standard Java collection > classes to implement it. I’m afraid I must agree with Manuel here; although I’m not very familiar with that whole area and I may well be also missing something. It seems to me that the main problem of FOP is that it isn’t able to render big documents, and that properties only play a part in that problem. It might be more useful to try and optimize the whole rendering process, from which everyone will benefit, those running FOP on a server as well as all others. That’s not the same kind of effort but it’s as much important IMHO. A cache per rendering run would do the thing, wouldn’t it? Coupled with a flyweight factory for those properties with a small number of possible values, which themselves could be shared among the different threads. Also, maybe it’s worth keeping in mind that while that’s not currently the case, we want to eventually make the rendering process multi-threaded. Although the two issues might actually not interfere. WDYT? Vincent
Re: More about the effects of the PropertyCache
On Friday 03 August 2007 07:01, Andreas L Delmelle wrote: > On Aug 1, 2007, at 10:06, Manuel Mall wrote: > > On Wednesday 01 August 2007 15:45, > > [EMAIL PROTECTED] > > > > What am I missing? > > You are not missing anything, AFAICT except that: > ... FOP still aims for 1.4 compliance > ... java.util.concurrent and like classes have not been implemented > under 1.4 yet... > Andreas, you didn't quite answer my question. Why do we need a shared property cache and therefore synchronisation, etc.? Why can't we have a simple cache per FOP rendering run? > > Cheers > > Andreas Manuel
Re: More about the effects of the PropertyCache
On Aug 1, 2007, at 10:06, Manuel Mall wrote: On Wednesday 01 August 2007 15:45, [EMAIL PROTECTED] What am I missing? You are not missing anything, AFAICT except that: ... FOP still aims for 1.4 compliance ... java.util.concurrent and like classes have not been implemented under 1.4 yet... Cheers Andreas
Re: More about the effects of the PropertyCache
On Wednesday 01 August 2007 15:45, [EMAIL PROTECTED] wrote: > Andreas L Delmelle writes: > > I've been playing with trying to make the cache thread-local > > If in doubt, you could always make the caching multi-level. > L1 is the current global cache. L2 is a thread-local equivalent > of the same - all pointing to the same properties (if required) > but with different access keys. The L2 cache in this case need > have no contention at all, > I have been following this discussion with very little attempt to understand the intricate technical details of concurrent maps etc, but I am wondering why we don't apply the KISS principle here? IIRC the original problem was that the FOP memory footprint for rendering large documents was causing issues. One set of culprits that were identified were the properties. Given that a FOP rendering run is single threaded, i.e. there are no threads created within FOP, why don't we start with a property cache per run? No threading issues, no performance issues, and large gains in memory footprint reduction. That will even benefit the memory footprint of concurrent FOP runs. Admittedly not to the extend as globally shared cache would do, but it would be much simpler and we can use the standard Java collection classes to implement it. What am I missing? > Richard Manuel
Re: More about the effects of the PropertyCache
Andreas L Delmelle writes: > Been reading up a bit on concurrency and synchronization and the > likes, and I'm in the meantime quite convinced that the strategy > should indeed be altered to move away from static final Maps. Don't > know if Richard is still listening in and has any opinions/ideas? I'm listening again. Been very busy for the last few weeks - moving etc. I've missed most of the recent conversations so if there is anything important, relevant and not covered on the 41044 messages please let me know. Back to the concurrency, I'm not sure entirely why this is an major issue. As I understand it the generation of a single fop document is mostly linear - i.e. single threaded. Assuming we're dealing with the case of multiple documents simultaneously (e.g. in a webapp), does anyone have usable numbers on the number of possible threads? > I've been playing with trying to make the cache thread-local If in doubt, you could always make the caching multi-level. L1 is the current global cache. L2 is a thread-local equivalent of the same - all pointing to the same properties (if required) but with different access keys. The L2 cache in this case need have no contention at all, Richard
Re: More about the effects of the PropertyCache
Andreas L Delmelle wrote: The CommonBorderPaddingBackground is still a memory eater, but I'll have to take a look at LengthRangeProperty and CondLengthProperty first for that. The trick is probably to split it BPBs in a set of "specified data" which can be cached and the "resolved data". Another idea I pursued (without success) in 0.20: some properties can only have a very limited set of values, in particular text-decoration, keeps, and a few others, which can probably implemented as singletons instead of using a cache. For properties like hyphenation-ladder-count which are likely to have only a few values from a potentially unlimited domain, a mix of pre-fabricated singletons for the most likely used values and a list in case another value is specified, without pruning during rendering, might provide the best balance between memory efficacy and performance. J.Pietschmann
Re: More about the effects of the PropertyCache
On Jul 28, 2007, at 17:21, Andreas L Delmelle wrote: FWIW: I just couldn't resist extending further upon the cache, and in the meantime I have locally implemented similar caching for CommonHyphenation (already proposed last week), CommonFont and KeepProperty. For CommonFont, since font-size and font-size-adjust may require dynamic evaluation context, I split up the class internally, so CommonFont itself gets only three references: fontSize, fontSizeAdjust and a CachedCommonFont instance containing the rest of the properties. If font-size and font-size-adjust are absolute numerics, then the whole CommonFont instance is also cached in turn. If not, then a separate instance will be created, but the six member variables for the cacheable props are traded for one member reference to a cached bundle of those props, so the non-cached instances that do get created should already be a lot smaller than they are now. For KeepProperty, the trick was to use the cache, not at creation time (since it's a compound it is currently not possible to do so: see bugzilla 41044 for the explanation), but in the getKeep() method which is used by FObj.bind() of the different FO subclasses. The scope of the non-cached KeepProperty instances is thereby limited to the PropertyList. The FOs will always get references to cached Keeps. Those few changes should already significantly alter the looks of a dump like the one we saw last week. More Blocks, TableCells etc. The CommonBorderPaddingBackground is still a memory eater, but I'll have to take a look at LengthRangeProperty and CondLengthProperty first for that. Unless any objections arise, I will commit the proposed changes tomorrow evening. Now, it's time for me to start that Wiki-page with the results of the user-poll... Later Andreas
Re: More about the effects of the PropertyCache
On Jul 25, 2007, at 00:51, Andreas L Delmelle wrote: Without doing any measurements, one can't say for certain, but I'm guessing the risk of thread contention is reasonably high. Already did a few measurements, albeit no real FOP-related tests, but purely focused on the concurrency behavior of PropertyCache itself. The results seem somewhat disconcerting, IIC. The more threads you spawn, and run simultaneously, the higher the penalty per thread. For maximum effect, try spawning 100 threads on a multi-CPU machine, and compare the average duration of a single thread performing 10-20 fetches with performing the same task 100 times in sequence. But even for as little as 10 threads, there already seems to be a severe performance penalty. As to that number, I must admit that I have no real idea on realistic figures for testing. Maybe 100 threads is a bit over the top... OTOH, if you look at the original profile Jeremias posted last week, we do see that PropertyCache.fetch() would be called roughly 100.000 times for NumberProperty alone for the same 10 instances, for such a document. Very right to be concerned. So, I got to thinking, and for that matter, more than only thinking... That said, Java 1.5 has added a neat util.concurrent package with, besides the famous Semaphore, for example, a ConcurrentHashMap implementation. Supposed to perform far better in case of heavy contention, since it never locks the entire underlying table. Tricky implementation to simulate under 1.4 though, AFAIU. :-( ... and here's the catch: Tricky, but certainly not impossible, so I found out. Cost me a bit of my time, but I hijacked some code from ConcurrentHashMap and HashMap to rewrite the PropertyCache as a 'WeakConcurrentHashMap', only, it's not really map. More a hashtable dedicated to storing only Property instances as keys. No need to store them also as (weak referents of) values, indeed! Instead of making it a sort of proxy to a synchronizedMap, I went all the way down to the roots, gave the PropertyCache its own table, implemented get() and put() methods but made those only accessible to the fetch() method, etc. Most tricky were in fact the cleaner thread and the rehash, which are implemented as follows: If any put() operation pushes the number of entries in a segment above double the amount of segments, it spawns a cleaner thread that tries to clean up the WeakReferences in that segment first. Obviously, since many of these cleaner threads can be spawned simultaneously, each one does a re-check under synch first to see if another already carried the load... if that is so, the thread immediately dies. If that has no effect, rehash() is called, which blocks the entire cache. It first checks to see if more than half the amount of segments exceed this threshold. If that is the case, the amount of segments is doubled and the entries are redistributed. The larger the cache grows, the less likely it becomes that it is ever completely blocked, and at the same time, the cleaner threads always makes sure that obsolete references will be removed before increasing the size, so the cache should never grow unnecessarily large. Almost didn't believe it myself at first, but if you compare the timings... I almost suspect that I overlooked something, but everything I tested so far seems to be working. Definitely works for FOP. Yes, it's an increased overhead in maintenance to 'roll your own', but still, if you look at the speed difference, that may be convincing enough. No more words! The code: see attachments for the revised PropertyCache + a tester class. Try it out, compare with the minimal PropertyCache in current FOP trunk, and let me know if there's any objection to committing it before the release (even if only as a temporary measure, so we can at least offer the reduced footprint without too much of a penalty in concurrent runs) PropCacheTester.java Description: Binary data PropertyCache.java Description: Binary data Cheers, Andreas
Re: More about the effects of the PropertyCache
On Jul 25, 2007, at 00:51, Andreas L Delmelle wrote: Hi all One more idea before I go to bed: In the meantime, I'll see if I can think up some tests/timings and make them part of the build. Jeremias wants to measure this for the past evolution, and I'd like to make it easier for us to track it in the future. In the longer term, we could maybe set up a page with statistics for this on the website (?) Would it make sense to build functionality into FOP to attach a chronometer to the process? An object, attached to ... the FOEventHandler, and can from there also be passed up the layout-tree. One that each type of object we consider to be critical can register timings on of its most important methods or something...? Cheers Andreas
More about the effects of the PropertyCache
Hi all Been reading up a bit on concurrency and synchronization and the likes, and I'm in the meantime quite convinced that the strategy should indeed be altered to move away from static final Maps. Don't know if Richard is still listening in and has any opinions/ideas? Without doing any measurements, one can't say for certain, but I'm guessing the risk of thread contention is reasonably high. OTOH, I also haven't measured how much time each thread would actually spend parsing properties. If that percentage is only marginal, then the risk would be not be immediately catastrophic, unless in case of a very large amount of concurrent threads where chances increase that two threads need the same Map at the exact same instance. In the meantime, I'll see if I can think up some tests/timings and make them part of the build. Jeremias wants to measure this for the past evolution, and I'd like to make it easier for us to track it in the future. In the longer term, we could maybe set up a page with statistics for this on the website (?) That said, Java 1.5 has added a neat util.concurrent package with, besides the famous Semaphore, for example, a ConcurrentHashMap implementation. Supposed to perform far better in case of heavy contention, since it never locks the entire underlying table. Tricky implementation to simulate under 1.4 though, AFAIU. :-( The benefit remains, though, since in the above example, given the results Jeremias provided, the save on those different Property instances together can easily mean a couple MB of extra heap per thread. I've been playing with trying to make the cache thread-local, but I did not immediately find a solution for situations like: - NumberProperty.getLength() has no context at all, but requires access to the cache of FixedLengths. Seems like NumberProperty should get a reference to the document it belongs to --a direct reference to the Root?-- from where it could gain access to a document-global cache of properties... The increase due to the reference is still insignificant compared to the save on distinct instances, and it would make the caches guaranteed thread-safe (since FOP is sequential internally) without the need for synchronization. On the other hand, as mentioned in Bugzilla 41044, I've been thinking about using the PropertyMakers to our advantage here. Make all Property constructors protected, so only the Makers can access them in their make() methods. Nowhere outside the properties package should any class ever instantiate a NumberProperty (note: not even the PropertyParser, that should use a NumberPropertyMaker to get to an instance) FlyWeightFactories? Good idea! I think we have them already, only, they are not exactly configured yet to flyweight anything. I'm going to make this a priority, together with doing something similar for CommonHyphenation and CommonFont (which could become a candidate if it were only decomposed in font-size + rest of the properties...). Just so you know :) Cheers Andreas