Jeremias Maerki wrote:

<snip/>

That is a simplistic way of looking at it, but at least it's objective rather than subjective.


Uhm, with these things you can always manipulate the plus/minus items so
that the calculation results in the desired value. I've seen the bad results
of such an approach. It's VERY difficult to make this objective.

Yes I totally agree.



My personal preference is for the first approach, because under approach 2 it would remain very difficult to write/modify IF XML.


That's why I tend towards approach 1 myself. But besides the
compatibility issue it's the only reason. Still, I had to take a step
back and look at alternative approaches.

Approach 2 has two nasty negatives, whereas the worst thing about approach 1 is the amount of work involved. Perhaps the choice will be obvious if you can put some actual time estimates against both approaches. I mean, if approach 2 can be implemented in only 25% less time than approach 1, then I'd push for approach 1; but if approach 1 requires 400% extra effort, then approach 2 would become the preferred option.



Regardless of the change:
* Preparation for a structure tree and tagged PDF will have to be done
at some point which has an impact on the area tree, the IF and the
renderers.
* Impact of full writing mode support is still unknown.


In contrast to the above, there's an additional way to increase
performance without much work:
We just make use of modern multi-core CPUs. FOP is mostly
single-threaded. If you look at the CPU usage on a dual-core machine,
you'll see that it stays at about 50% while rendering. If we do area
tree parsing in the main thread and rendering in another, we can do
both at the same time. That's an easy way to decouple the two tasks.
There's also no need for fine-grained synchronization, as it can be
done per page, which is coarse enough not to create a performance
problem. The only risk I see is memory consumption, as the layout
engine or the area tree parser might build pages faster than the
renderer can render them. But if that happens we could probably add a
setting that blocks the page source if the renderer is too far behind.
Based on my profiling I would estimate the performance improvement to
be in the area of 50-60% on multi-core CPUs, even for the non-IF case.
Single-core CPUs will probably not profit, but they won't suffer either.
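
(To make the per-page hand-off concrete, here is a minimal sketch of the kind of pipeline described above, using a bounded queue so the producing side blocks automatically once the renderer falls too far behind. PagePipeline, Page and the parse/render calls are placeholders invented for the sketch, not actual FOP classes.)

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Sketch: decouple area-tree/IF parsing from rendering, one page at a time. */
public class PagePipeline {

    /** Placeholder for a fully resolved page; not a real FOP class. */
    static class Page { }

    /** Poison pill marking the end of the document. */
    private static final Page EOF = new Page();

    public static void main(String[] args) throws InterruptedException {
        // Bounded capacity: the parsing side blocks once it is 16 pages
        // ahead, which caps memory use without any fine-grained locking.
        final BlockingQueue<Page> handoff = new ArrayBlockingQueue<Page>(16);

        Thread renderer = new Thread(new Runnable() {
            public void run() {
                try {
                    Page page;
                    while ((page = handoff.take()) != EOF) {
                        renderPage(page);            // hypothetical rendering call
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        renderer.start();

        // Main thread: build pages (layout engine or area-tree parser).
        while (hasMorePages()) {                     // stub
            handoff.put(parseNextPage());            // blocks if the renderer lags
        }
        handoff.put(EOF);
        renderer.join();
    }

    private static boolean hasMorePages() { return false; }
    private static Page parseNextPage() { return new Page(); }
    private static void renderPage(Page page) { }
}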

-1 to this approach for several reasons:

1) Anyone integrating FOP via the API will not be able to plug into a J2EE container because spawning threads is not allowed in an EJB.


Sorry, but that doesn't count. FOP already violates the J2EE spec today:
- It spawns threads.

That's no good reason to violate it further. I could say FOP has Checkstyle issues, but when we receive patches with Checkstyle problems they are usually rejected or at least modified before commit.

AFAICT it is currently only the PropertyCache that spawns new threads. And as you know, there is a problem with that which needs addressing. If I had the time I would be looking at a way to solve the problem in a J2EE-compliant way so I can use FOP in an EJB without resorting to using JCA.

- It accesses local files through java.io.File (although that can be
worked around by proper usage (i.e. URIResolver & Co.)).

Another way of looking at it: it's possible to use FOP in an EJB by making proper use of the API.

- Some plug-ins (most notably Batik) also spawn new threads.

If anyone wants to run FOP in a J2EE-conformant way, they have to put
it in a JCA wrapper.


2) Keeping FOP single-threaded means the choice to multi-thread is left with the API integrator. It is very easy to get FOP to use the full CPU power of multi-core machines. I have a test app of only about 100 lines of code which starts multiple threads, each of which pulls documents from a pool.
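
(For illustration, a harness of the kind described might look roughly like this; ParallelFoRunner and renderToPdf are names invented for the sketch, and the FOP calls follow the usual FopFactory/Fop embedding pattern, with one Fop instance per rendering run.)

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.LinkedList;
import java.util.List;

import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;

import org.apache.fop.apps.Fop;
import org.apache.fop.apps.FopFactory;
import org.apache.fop.apps.MimeConstants;

/** Sketch of document-level parallelism: N worker threads, one FOP run each. */
public class ParallelFoRunner {

    public static void main(String[] args) throws InterruptedException {
        final FopFactory fopFactory = FopFactory.newInstance(); // reused; one Fop per run
        final List pool = new LinkedList();                     // "pool" of input documents
        for (int i = 0; i < args.length; i++) {
            pool.add(new File(args[i]));
        }

        int threads = Runtime.getRuntime().availableProcessors();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    while (true) {
                        File foFile;
                        synchronized (pool) {                   // pull the next document
                            if (pool.isEmpty()) {
                                return;
                            }
                            foFile = (File) pool.remove(0);
                        }
                        renderToPdf(fopFactory, foFile);
                    }
                }
            });
            workers[i].start();
        }
        for (int i = 0; i < threads; i++) {
            workers[i].join();
        }
    }

    private static void renderToPdf(FopFactory fopFactory, File foFile) {
        File pdfFile = new File(foFile.getParent(), foFile.getName() + ".pdf");
        try {
            OutputStream out = new BufferedOutputStream(new FileOutputStream(pdfFile));
            try {
                Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF,
                        fopFactory.newFOUserAgent(), out);
                Transformer transformer = TransformerFactory.newInstance().newTransformer();
                Source src = new StreamSource(foFile);
                Result res = new SAXResult(fop.getDefaultHandler());
                transformer.transform(src, res);                // one document per thread
            } finally {
                out.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}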


Yes, that's a common approach. But it doesn't help with print file
generation from intermediate files, where you cannot parallelize so
easily if you are not allowed to create multiple print files for one
job. If that were possible we wouldn't necessarily be discussing this
topic now. Note that parallelizing the rendering could also be made
optional, if someone wants to disable it. At any rate, people running
FOP in a single-document context (documentation, for example) could
profit from the speed-up. CPUs became multi-core because it has become
difficult to make single cores much faster. At some point, developers
have to adjust to that.

Actually I have a framework to solve that problem too :) Essentially, most large-volume job runs require a grouping of some sort, whether the grouping relates to sorting algorithms (all jobs for Post Area X going into bag X) or is simply a split every 5000 pages because the printer only has a paper capacity of 5000. So there are often ways to break up the print streams in this scenario too.
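
(A minimal sketch of that kind of page-count grouping, assuming a job is just an ordered list of documents with known page counts; JobSplitter, Doc and the sample document names are invented for the sketch.)

import java.util.ArrayList;
import java.util.List;

/** Sketch: split an ordered job into batches of at most maxPages pages,
    so each batch can become its own print file (and its own FOP run). */
public class JobSplitter {

    static class Doc {
        final String name;
        final int pages;
        Doc(String name, int pages) { this.name = name; this.pages = pages; }
    }

    static List/*<List<Doc>>*/ split(List/*<Doc>*/ job, int maxPages) {
        List batches = new ArrayList();
        List current = new ArrayList();
        int count = 0;
        for (int i = 0; i < job.size(); i++) {
            Doc doc = (Doc) job.get(i);
            if (count + doc.pages > maxPages && !current.isEmpty()) {
                batches.add(current);          // close the batch, e.g. at 5000 pages
                current = new ArrayList();
                count = 0;
            }
            current.add(doc);
            count += doc.pages;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;                        // render each batch in its own thread
    }

    public static void main(String[] args) {
        List job = new ArrayList();
        job.add(new Doc("invoice-batch-A", 3200));
        job.add(new Doc("invoice-batch-B", 2400));
        job.add(new Doc("invoice-batch-C", 1900));
        System.out.println(split(job, 5000).size() + " print files"); // prints "2 print files"
    }
}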

Of course, that doesn't help users running FOP from the command line. I would expect that most people using the Intermediate Format will be doing so in a batch processing environment with FOP integrated via the API. I can't imagine people actually running FOP from the command line and tweaking the IF XML by hand.



3) Code maintenance becomes a nightmare in a multi-threaded application. The synchronization might look simple at first, but it quickly becomes very, very difficult. Three of us at my company recently spent three months troubleshooting the bugs out of a multi-threaded application which is somewhat smaller and less complex than FOP. This point is reiterated by Andreas' recent efforts to fix synchronization of the cache cleaner threads in the PropertyCache. It looks simple in theory but in practice turns out to be more trouble than it's worth.


That's a valid point. Concurrency with plain Java 1.4 is tricky indeed,
but toolkits like util.concurrent and JSR166 (java.util.concurrent in
Java 1.5) help a lot in making this stuff easier.

http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html
http://backport-jsr166.sourceforge.net/
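
(For illustration, the hand-rolled document pool sketched earlier collapses to a few lines once an executor is available; as far as I know the backport ships the same classes under edu.emory.mathcs.backport.java.util.concurrent, so on Java 1.4 only the imports would change. PooledRendering and render() are invented names.)

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** The same document pool as the hand-rolled version, expressed with an
    executor: no explicit locking, no manual worker loop. */
public class PooledRendering {

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (int i = 0; i < args.length; i++) {
            final String foFile = args[i];
            pool.execute(new Runnable() {
                public void run() {
                    render(foFile);            // one FOP run per task
                }
            });
        }
        pool.shutdown();                       // accept no new tasks, finish the queued ones
        pool.awaitTermination(1, TimeUnit.HOURS);
    }

    private static void render(String foFile) {
        // ... standard FOP embedding code, as in the earlier sketch ...
    }
}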

Still sticking to the -1 for that one? :-)

I remain unconvinced.

Sorry,

Chris

