DO NOT REPLY [Bug 51052] NullPointerException with retrieve marker

bugzilla Sat, 16 Apr 2011 13:06:50 -0700

https://issues.apache.org/bugzilla/show_bug.cgi?id=51052


--- Comment #8 from Andreas L. Delmelle <[email protected]> 2011-04-16 16:06:21 EDT ---
(In reply to comment #6)
> > Well, I guess that would depend on how this was implemented. If we were being
> > puritanical, one could argue that if FOText was an object representation of
> > #PCDATA (which I'm pretty sure it is), then by creating a #PCDATA child in the
> > FOTree, we are creating an invalid node.

> No, we're not. #PCDATA is always a valid child node for a marker (i.e. the
> content model for a marker is "(#PCDATA|%inline;|%block;)*"). 
> It will only, potentially, /become/ invalid in the retrieval context.

I suddenly realize this needs more explanation, as there is obviously the
remark about the retrieve-marker's parent...

Looking at it from FOP's perspective, at parse-time (i.e. when the FO tree is
built), there is no way to know when --or even if-- a marker will actually be
retrieved. Granted, we _could_ decide to throw an error if there is even the
smallest probability of a mismatch, but we would never know for certain whether
it would actually cause an error. 
I am far from convinced that this justifies the added computational complexity
of walking up the tree, and checking all static-contents for a retrieve-marker
that _might_ retrieve a particular marker.

What I mean is: it is not incorrect/invalid to create the #PCDATA node as a
child of the marker. However, to concede your point, it is definitely
incorrect to blindly copy it and re-bind it to the wrong parent.
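
To illustrate the distinction, here is a minimal sketch of checking validity at
retrieval time rather than at parse time. The classes SketchNode and
MarkerRetrieval are hypothetical and much simpler than FOP's real FONode
hierarchy, and the acceptsText() rule is an assumption made up for the example,
not the actual content model from the spec.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified node model; these are NOT FOP's real classes.
class SketchNode {
    final String name;                       // e.g. "fo:block", "#PCDATA"
    final List<SketchNode> children = new ArrayList<>();
    SketchNode parent;

    SketchNode(String name) { this.name = name; }

    boolean isText() { return "#PCDATA".equals(name); }

    // Assumption for this sketch: only these parents accept character data.
    boolean acceptsText() {
        return name.equals("fo:marker")
            || name.equals("fo:block")
            || name.equals("fo:inline");
    }

    void add(SketchNode child) {
        child.parent = this;
        children.add(child);
    }
}

class MarkerRetrieval {
    // Copy the marker's children under the retrieve-marker's parent,
    // validating against the NEW parent first instead of copying blindly.
    static void retrieveInto(SketchNode marker, SketchNode newParent) {
        for (SketchNode child : marker.children) {
            if (child.isText() && !newParent.acceptsText()) {
                throw new IllegalStateException(
                    "#PCDATA from marker is not allowed under " + newParent.name);
            }
        }
        for (SketchNode child : marker.children) {
            newParent.add(deepCopy(child));
        }
    }

    static SketchNode deepCopy(SketchNode source) {
        SketchNode copy = new SketchNode(source.name);
        for (SketchNode c : source.children) {
            copy.add(deepCopy(c));
        }
        return copy;
    }
}
```

The #PCDATA child is accepted when it is added to the marker, and only the
retrieval context decides whether the copy is legal.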

(In reply to comment #7)
> Note: I am not necessarily against this myself. It would be pretty cool,
> actually, if we were to store only the raw FO source of the marker-subtree, in
> a CharBuffer, to be parsed later. At first glance, it could turn out to be
> slightly more efficient in terms of memory footprint. I'd need to see proof to
> be certain, but it might...

... and after some tests, I can see that this is definitely not always so cool.
;-)
A lot depends on the actual structure of the subtree. FO is quite verbose, so
even a small table already takes up quite a few chars, which does not always
compare favorably with simply instantiating the FONodes to store the data.

If we're really serious about further optimization, then in terms of footprint,
the best option may just be to create a generic MarkerDescendant node type, and
convert those into the proper FONode subclass later, if and when they are
actually retrieved.
That is: as opposed to the current approach of immediately instantiating the
proper type at parse time, and cloning those instances later, when the area
tree is built.
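
A rough sketch of that idea, to make the trade-off concrete. Only the name
MarkerDescendant comes from the proposal above; TypedNode, BlockNode,
InlineNode, TextNode and the materialize() method are hypothetical stand-ins
for illustration, not existing FOP classes or API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical typed nodes standing in for the real FONode subclasses.
abstract class TypedNode {
    final List<TypedNode> children = new ArrayList<>();
}

class BlockNode extends TypedNode { }

class InlineNode extends TypedNode { }

class TextNode extends TypedNode {
    final String text;
    TextNode(String text) { this.text = text; }
}

// Lean, generic placeholder created at parse time: a name, optional
// character data, and children. No type-specific members are allocated.
class MarkerDescendant {
    final String name;   // e.g. "fo:block", "fo:inline", "#PCDATA"
    final String text;   // only meaningful for "#PCDATA", otherwise null
    final List<MarkerDescendant> children = new ArrayList<>();

    MarkerDescendant(String name, String text) {
        this.name = name;
        this.text = text;
    }

    // Convert to the proper typed node only when the marker is retrieved.
    TypedNode materialize() {
        TypedNode node;
        switch (name) {
            case "fo:block":
                node = new BlockNode();
                break;
            case "fo:inline":
                node = new InlineNode();
                break;
            case "#PCDATA":
                node = new TextNode(text);
                break;
            default:
                throw new IllegalArgumentException(
                    "Element not covered by this sketch: " + name);
        }
        for (MarkerDescendant child : children) {
            node.children.add(child.materialize());
        }
        return node;
    }
}
```

Markers that are never retrieved never pay for the typed instances, and each
retrieval produces a fresh typed subtree without a separate cloning step.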

Strictly speaking, in the current process, some space is still taken up by the
unused references for members in flow.Block, flow.Table, FOText... That space
is actually wasted, since the specified properties/attributes are stored in a
Map that is associated with the Marker. 
If we strip the MarkerDescendants down to lean, basic FObj instances, that
might save some memory in larger documents with a lot of markers, especially if
only a relatively small number are actually retrieved.

