As a completely unintentional side-effect, it gave me the tools to solve the really critical FOP performance problem on large files. This isn't going to be solved by micro-efficiencies on FO tree storage.

I'm going to use this as a excuse to outline my thinking for handling large input files. I'm almost sure that it is a bit different from your view.

This is a rant, and I don't have a working patch to back it up.

The problem with Keirons layout code (with respect to large input files) is that it works top-down on the LM tree and thus require the creating of the complete LM tree for the page sequence. To better fit within SAX event model the layout process should also be event driven, triggered f.ex by the endBlock() and character() events. That means that the traversing of the FOTree cannot be easily done using recursive decend. Instead the main loop over the FO tree could use a non-recusive tree treverse like this:

        public Element traverse(Node q, Node end, int mode) {
            while (q != end) {
                switch (mode) {
                case FIRST:
                    // First time in the node.
                    if (q.hasChildNodes()) {
                        q = q.getFirstChild();
                    mode = NEXT;
                    // fallthrough
                case NEXT:
                    // All children are handled.

                    // Add the node to its parent.
                    if (q.getParentNode() != null) {
                        child(q.getParentNode(), q);
                    // fallthrough
                case RESTART:
                    // Attempt to move vertically to next sibling.
                    if (q.getNextSibling() != null) {
                        q = q.getNextSibling();
                        mode = FIRST;
                    // No sibling found, move to the parent to
                    // do postorder.
                    q = q.getParentNode();
                    mode = NEXT;
            return null;

This is the algorithm from Knuth TAoCP, vol I, Page 323 program S. It traverses any DOM tree fragment from 'q' to 'end' without using the java stack. The tree must support getFirstChild() and getNextSibling(). During the traverse it fires 'preorder', 'postorder' and 'child' events.

When applied to FOP the nodes are LayoutManagers and getFirstChild() and getNextSibling() will create the LayoutManager for the first child / next sibling.

The loop can be stopped when we temporary run out of FO tree nodes and restarted again when new nodes has been added. I suppose that the FO tree can then be viewed as a stream of FO nodes.


Reply via email to