Hi all,

after noticing that the layout tree pruning in the prototype [1] was improving performance I decided to implement it in FOP trunk to see how it was behaving in the real world. The concept I've applied to the general breaking algorithm can be defined as a "last-n-lines fit" algorithm; if n=5, when the algorithm find the first node for the sixth line it choose the final active node for the first line that have the best active node as child (this is a correction from the prototype implementation). Other nodes for line 1 gets discarded with relative children. When the algorithm determine a line break for line 7 the pruning is performed for line 2. This is equivalent to say that line 1 is chosen as the best in the context of lines 1-5, line 2 in the context of lines 2-6, and so on. While this algorithm won't give you the best possible layout (well, not always, depending on n value) as total fit do, it can improve performance and shrink memory usage by reducing the number of active node in some cases. Anyway it gets "nice" layouts, surely better than best fit, and by choosing the value of 'n' we can set the trade off between performance and quality. In the tests I've done I saw a sensible improvement when two conditions were satisfied: 1) the layout constraints was leading to an high number of active nodes (short lines, hyphenation enabled, non justified alignment); 2) the paragraph was long enough to produce many lines (if n is greater than the total number lines the pruning doesn't even get activated).

As worst case I considered an A4 page with a single column with justified text split in many paragraphs: no performance improvements. Even if I process a single paragraph that is 8 page long the processing time is the same as if pruning was disabled, while the layout result is slightly different (but not bad). As best case I considered an A4 page with two columns, hyphenation enabled, left aligning for a single paragraph 8 page long. Here the performance boost is embarrassing:

         | cpu t | mem
---------+-------+--------
No prune | 2'20" | 570 MB
---------+-------+--------
n = 30   | 0'12" | 35 MB
---------+-------+--------
n = 1    | 0'07" | 35 MB

Well, this is a really extreme test that unlikely will represent a use case, let's look at more tests:

 Just | 1 col | 1 par || no pru | n=30 |  n=1
------+-------+-------++--------+------+------
  X   |   X   |   X   ||    6"7 |  6"6 |  6"6
------+-------+-------++--------+------+------
  X   |   X   |       ||    6"6 |  6"6 |  6"7
------+-------+-------++--------+------+------
  X   |       |   X   ||   14"4 |  7"1 |  6"9
------+-------+-------++--------+------+------
  X   |       |       ||    7"9 |  7"3 |  6"9
------+-------+-------++--------+------+------
      |   X   |   X   ||   32"6 |  9"0 |  6"7
------+-------+-------++--------+------+------
      |   X   |       ||    9"9 |  9"0 |  6"6
------+-------+-------++--------+------+------
      |       |   X   || 2'20"2 | 12"0 |  7"0
------+-------+-------++--------+------+------
      |       |       ||   21"0 | 13"0 |  7"0

The test was done with the same text modified as follows: justified or left aligned, 1 or 2 columns, within a single block or split in 9 blocks of the same size.

I've attached the patch to be applied to trunk: to enable pruning you can set the system property from the command line with -DtreeDepth=1 or any other number you choose for 'n'. If you don't define treeDepth the pruning doesn't get enabled. I've also attached the test fo files so you can check the resulting layout for the tests I reported above.


Dario


[1] 
http://mail-archives.apache.org/mod_mbox/xmlgraphics-fop-dev/200810.mbox/[EMAIL 
PROTECTED]

Attachment: pruneTrunk2.diff
Description: Binary data

Attachment: fo-test.tar.bz2
Description: Binary data

Reply via email to