On Jan 11, 2007, at 22:31, J.Pietschmann wrote:
Quite some time ago I did some statistics on number of children
of FOs, using the FOP examples and FO files from bug reports.
The breakdown was roughly the following:
~50% no children, mostly FOText nodes and FOs like region-body
and page-number-citation
~40% one child, mostly blocks and inlines (fo:wrapper) having
exactly one FOText node as child
<10% 2..10 children
<<1% more than 10 children, mostly fo:flow, table and table-body
and a few blocks, usually wrapping other blocks.
Real world documents with more tables and inline formatting might
have more multi-child FOs.
Interesting figures...
I haven't checked whether FOText still inherits the children field
on trunk. If so, it is certainly a good idea to get rid of this
(in the maintenance branch, this had widespread implications).
The case of exactly one child might be worth optimizing too.
This was indeed altered, though I don't know exactly when or by whom
(Glen or Finn, IIRC).
Anyways, the hierarchy is currently:
FONode
|->FOText
|->FObj
and only FObj has a protected childNodes instance member, which is a
generic ArrayList (and as I hinted, they are all created with: new
java.util.ArrayList(), which defaults to an initial backing Object[10]).
A FONode only holds the reference to the parent in the tree, and the
FObj.childNodes list is only created when FObj.addChildNode() is
called, so if there is no child element, this reference will always
be null.
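
To make the above concrete, here is a minimal sketch of that lazy
creation. SketchNode and SketchObj are hypothetical stand-ins, not the
actual FOP classes, and childCount() is a helper added only for
illustration. Generics are used for readability.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for FONode: only the parent reference.
class SketchNode {
    SketchNode parent;
}

// Hypothetical stand-in for FObj: the child list is created lazily.
class SketchObj extends SketchNode {
    // Stays null for as long as no child has been added.
    protected List<SketchNode> childNodes;

    void addChildNode(SketchNode child) {
        if (childNodes == null) {
            // Default constructor: the backing array has capacity 10,
            // allocated eagerly on older JDKs and on first add on newer ones.
            childNodes = new ArrayList<>();
        }
        child.parent = this;
        childNodes.add(child);
    }

    // Illustration-only helper that tolerates the null list.
    int childCount() {
        return childNodes == null ? 0 : childNodes.size();
    }
}
```

With this layout, a node with exactly one child still pays for a
ten-slot backing array, which is the ~40% case in the figures above.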
Two possible solutions:
A) all FO nodes implement an FOContainer interface, for example
FONode childAt(int)
int numberOfChildren()
where FOText for example would hardcode return values of null and 0.
B) Use a FOChildrenIterator interface with specific implementations
for FO nodes which can have none or exactly one child.
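
A rough sketch of how option A could look. Only FOContainer, childAt
and numberOfChildren come from the proposal; TextLeaf, SingleChildNode
and ContainerWalk are hypothetical names, and childAt returns
FOContainer here (rather than FONode) purely to keep the example
self-contained.

```java
// Option A: every node answers the same two questions.
interface FOContainer {
    FOContainer childAt(int index);
    int numberOfChildren();
}

// Leaf such as FOText: hardcoded "no children", no list field at all.
final class TextLeaf implements FOContainer {
    public FOContainer childAt(int index) { return null; }
    public int numberOfChildren() { return 0; }
}

// Node with exactly one child: a single field instead of a list.
final class SingleChildNode implements FOContainer {
    private final FOContainer onlyChild;

    SingleChildNode(FOContainer onlyChild) {
        this.onlyChild = onlyChild;
    }

    public FOContainer childAt(int index) {
        return index == 0 ? onlyChild : null;
    }

    public int numberOfChildren() { return 1; }
}

// A generic recursive algorithm only needs the interface.
final class ContainerWalk {
    static int countNodes(FOContainer node) {
        int total = 1; // count the node itself
        for (int i = 0; i < node.numberOfChildren(); i++) {
            total += countNodes(node.childAt(i));
        }
        return total;
    }
}
```

Generic tree walks keep working unchanged, while the zero-child and
one-child cases no longer carry a list.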
I was already thinking along the lines of creating a subclass of
Vector, or an implementation of List, but I'm beginning to wonder if
it wouldn't be worth it to create a link between the children...?
Instead of holding a reference to an ArrayList, each FObj would have
three references: parent, firstChild, nextSibling. Add that
FOContainer interface or a FONodeIterator to navigate through it...
could work out. 8)
The benefit would be that the average size of an FObj becomes much
more transparent: three references, the properties and a handful of
private helpers.
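
A minimal sketch of the linked variant, assuming a hypothetical
LinkedNode class (the name field and childIterator() are made up for
illustration, nothing like this exists on trunk):

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical node with exactly three tree references.
class LinkedNode {
    LinkedNode parent;
    LinkedNode firstChild;
    LinkedNode nextSibling;
    final String name; // illustration only

    LinkedNode(String name) {
        this.name = name;
    }

    void addChildNode(LinkedNode child) {
        child.parent = this;
        if (firstChild == null) {
            firstChild = child;
            return;
        }
        // Walk to the last sibling; linear in the number of children.
        LinkedNode last = firstChild;
        while (last.nextSibling != null) {
            last = last.nextSibling;
        }
        last.nextSibling = child;
    }

    // Iterates over the direct children by following nextSibling.
    Iterator<LinkedNode> childIterator() {
        return new Iterator<LinkedNode>() {
            private LinkedNode cursor = firstChild;

            public boolean hasNext() {
                return cursor != null;
            }

            public LinkedNode next() {
                if (cursor == null) {
                    throw new NoSuchElementException();
                }
                LinkedNode current = cursor;
                cursor = current.nextSibling;
                return current;
            }
        };
    }
}
```

Appending walks the sibling chain here, which would hurt for the rare
FOs with many children (fo:flow, table-body). Remembering the last
added child, either in a fourth reference or in whatever builds the
tree, would avoid that.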
Furthermore, in the maintenance branch most of the more specific
FOs copied children from the generic children list into properly
typed fields before starting layout. In many cases the generic
children list could have been deleted afterwards, had this not
broken a few generic recursive algorithms like the one adjusting
available space due to footnotes. In the discussion back then, Keiron
said he'd even get rid of the generic children list in favour of
properly typed fields, thereby giving up some flexibility needed for
extensions.
In the trunk, extensions are treated very differently, from what I can
tell, mainly thanks to Jeremias, IIRC, who made extensive changes when
implementing XMP metadata. They are stored in a separate collection
(ExtensionAttachments), which adds its own flexibility, but also
difficulty. I'm still not sure how one would have to write an
arbitrary fo extension that is supposed to influence the flow of the
layout algorithm without needing access to the deeper layout API.
Maybe we'd need a sort of generic ExtensionLayoutManager, too...?
Andreas