Hi Weiqi,

>>In unmodified FOP, input from the source FO document occurs entirely at 
>>the beginning of the FOP run, and output entirely at the end. Therefore, 
>>the largest part of FOP processing occurs entirely in-memory and is 
>>therefore CPU bound. If you run 30 CPU bound threads concurrently on a 
>>machine with less than 30 CPUs then you are going to get degraded 
>That's exactly what we observed.  Unfortunately, in the servlet world,
>multi-threading is the norm.  Not so much to distribute the load as to
>handle concurrent requests.  I simply cannot queue up all the requests
>and process them one at a time.
Why not? It would be pretty easy to set up a queue that processes FOP 
objects serially, blocking the servlet thread until the requested object 
is formatted. With a centralised Queue you could then select the number 
of FOP processors that are processing the entries on the queue. I have a 
nice multithreaded Queue implementation that allows multiple servers on 
a single queue with multiple submitters - it doesn't block the enqueue 
caller, but that wouldn't be too difficult to add. That way you can have the 
right number of threads for the number of CPUs you have, and you should 
get better overall performance and much less memory usage. Sure, the 
servlet thread(s) will block waiting for processing, but that's what 
happens anyway, right?
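
To make the idea concrete, here is a minimal sketch of that kind of 
queue. It is not my own Queue implementation; it assumes a JDK that 
ships java.util.concurrent, and the names RenderQueue, sizedToCpus and 
submitAndWait are made up for illustration. The FOP run itself would go 
inside the Callable.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: a shared queue served by a fixed number of workers.
class RenderQueue {
    private final ExecutorService workers;

    RenderQueue(int workerCount) {
        // A fixed pool keeps at most workerCount jobs running at once;
        // any further submissions wait in the pool's internal queue.
        this.workers = Executors.newFixedThreadPool(workerCount);
    }

    // Assumption: one worker per CPU suits a CPU-bound formatting job.
    static RenderQueue sizedToCpus() {
        return new RenderQueue(Runtime.getRuntime().availableProcessors());
    }

    // Called from the servlet thread: enqueue the job, then block until
    // one of the workers has finished it.
    <T> T submitAndWait(Callable<T> job)
            throws InterruptedException, ExecutionException {
        Future<T> result = workers.submit(job);
        return result.get();
    }

    void shutdown() {
        workers.shutdown();
    }
}
```

Each servlet request would call submitAndWait() with its formatting 
job, so 30 concurrent requests still block, but only as many FOP runs 
as there are workers are in memory at any one time.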

I am actually interested in solving this problem because I am using 
XSLT->XHTML in servlets, but I am using an HTML serializer rather than 
FOP to output to browsers.

>The question then becomes, is there room for improvements.  From what
>I've heard so far, the answer is yes.  Replacing inefficient JDK 1.1
>always synchronized data structures, streamline the processing model,
>find ways to reduce object creation and garbage collection, seems to be
>the key.
Agreed, there are plenty of places to improve FOP, definitely in the 
current implementation but also (apparently) in the design. I've just 
finished (I hope!) the first of a couple of iterations that should 
improve the throughput and memory use of FOP significantly. I hope to 
send patches and JARs out in a few hours, but there are plenty of places 
where things could be improved. Just one example is the FONode which 
always allocates a Vector, even if it has no children... there are 
plenty of these things - but that's OK. It's much more important to get 
things broadly correct than to be totally optimised, especially at this 
early stage.
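
For what it's worth, the kind of change I mean for the child list looks 
roughly like this. It is a sketch only, not the actual FONode code or my 
patch; LazyNode and its methods are hypothetical names, and I've used 
ArrayList where the current code uses Vector.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a tree node; not the real FONode class.
class LazyNode {
    // Stays null until the first child is added, so leaf nodes
    // never pay for an empty collection.
    private List<LazyNode> children;

    void addChild(LazyNode child) {
        if (children == null) {
            children = new ArrayList<>(2); // small initial capacity
        }
        children.add(child);
    }

    // Callers always get a list back, never null.
    List<LazyNode> getChildren() {
        return children == null
                ? Collections.<LazyNode>emptyList()
                : Collections.unmodifiableList(children);
    }

    // Exposed only so the saving can be checked.
    boolean hasAllocatedChildList() {
        return children != null;
    }
}
```

Since most nodes in a typical FO tree are probably leaves, skipping the 
allocation should cut both object creation and garbage collection, but 
I'd want to measure that before claiming anything.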

>> When you say 6, 30 and 12 "fold", I assume you mean times?
>Yes.  But I have to prefix that with 'uncontrolled unscientific
>unofficial pseudo-benchmark on a home machine'.  Don't read too much
>into it.
Sure, I wasn't reading too much into it, just replying. But a 30-"fold" 
time increase from 2 seconds, doubling with each fold, is about 68 years. 
:-) I'm just a pedant, I think; everyone these days seems to say "fold" 
when they mean "times". 
It's like saying one "could care less" when one actually couldn't. :-)

>Following suggestions from another post in the thread, we tried Saxon in
>place of Xalan, and achieved noticeable speedups.  And yes, Saxon is
>picky!  I'll share some numbers later.

>It raises a question about the suitability of using FOP in a servlet
>environment.  We certainly learned what is and is not achievable with
>today's FOP.  And we'll regulate its use in a way that won't flatline
>the servers.
I believe that there are going to be processing speed limitations in 
FOP, and (IMO) XML in general, for the next while. For example, most 
software (including FOP, I think) seems to process XML by comparisons 
with tags, lookups in dictionaries, etc etc. While this is lovely, what 
we end up with is this whole chain of inefficient processing - from XSL 
to FO to PDF - where everything is dealt with as Strings, with the 
management and resource usage that entails (regardless of language). It 
would be ideal if, for example, an XSLT processor could say to FOP, "OK, 
instead of sending you the string '<fo:block>', I'm going to send you 
the integer 1" - this double-handling of everything slows XML processing 
down significantly, and this means that there are going to be limitations 
on how fast it's possible to make FOP go. In reality, I reckon that XML 
needs to be preprocessed before computers get to it, kinda like using 
lex before sending stuff off to yacc. But rather than whinge I should 
just do something about it in my "spare time" :-)
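
As a rough illustration of the "send the integer 1" idea, a shared table 
could hand out a small integer code the first time a tag name is seen, 
and both ends could then compare ints instead of Strings. This is a 
sketch under my own assumptions; TagTable, codeFor and nameFor are 
hypothetical names, and nothing like this exists in FOP or in any XSLT 
processor that I know of.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical table mapping tag names to small integer codes.
class TagTable {
    private final Map<String, Integer> codes = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    // Returns the existing code for a name, or assigns the next free one.
    // Codes start at 1, in order of first appearance.
    int codeFor(String name) {
        Integer code = codes.get(name);
        if (code == null) {
            code = names.size() + 1;
            codes.put(name, code);
            names.add(name);
        }
        return code;
    }

    // Reverse lookup, e.g. for error messages or serialisation.
    String nameFor(int code) {
        return names.get(code - 1);
    }
}
```

The String lookup then happens once per distinct tag name at the 
producer, and the consumer can switch on the code directly.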

Anyway, all of the above are just the demented ramblings of someone who 
wishes that there was a way of efficiently encoding XML documents for 
processing. I love XML, but to summarize, I think FOP's problem is 
largely XML, not FOP.

>>I'll volunteer to test.  Email me at [EMAIL PROTECTED]
Great. I'm going to have a look at another phase of optimisation before 
I send you something because I suspect that all the renderers are 
working a particular way and, if so, then I may as well take advantage 
of it now - which will reduce the memory footprint even more, if I am 
right. But I have just got the code into a submittable form, and 
up-to-date w.r.t. CVS, so I expect to have something to send tonight, 
regardless. I'll probably just post a URL a bit later.

