On Jan 5, 2006, at 18:48, Andreas L Delmelle wrote:

<snip />

To summarize this thread (it has taken long enough :-))

I thought it over a bit more, and what I'm currently working on (and will most likely finish during the weekend) is the following:

1) Basically keep the algorithm the way I recently altered it, but with some additional processing for trailing inline FOs that end with a sequence of white-space. Detecting these is easy enough, since it just means that XMLWhiteSpaceHandler.inWhiteSpace will be false after handleWhiteSpace(). At the end of the block, we will do one more pass over all those trailing inlines, if any. IMO, in the vast majority of use-cases there will be zero, one or at most two of those, but theoretically there could be any number... If there are any and white-space-collapse has its default value of "true", only one trailing white-space character will be left at that point, so this additional bit of processing will cost virtually nothing.

2) Simplify the CharIterator structure, in the sense that we'll still only need an iterator over FOText and Characters. Unless layout needs access to the iterators, I think charIterator() can be pushed down to be specific to FObjMixed, and then the overrides of this method can be removed from all other FOs apart from FOText and Character. For 1), it could turn out handy if I add the possibility to iterate backwards until the last non-white-space is encountered...

3) Exclude markers (and their descendants) from white-space handling during refinement, for the mentioned reasons:
  * retrieve-marker's ancestor's white-space properties govern the treatment in this case
  * possibly page-break context is needed when dealing with alternating static-contents
  * retrieve-markers with retrieve-boundary="document"

Point 3) of course means that the recently enabled marker_bug.xml testcase will have to be disabled again until we find a way to tackle this in layout. I had thought of using XMLWhiteSpaceHandler itself for this, but the tricky part is that once a Marker (and its descendants) has been white-space-treated, the stripped white-space is permanently gone, whereas that same Marker can be retrieved again in a different context, etc.
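
To make points 1) and 2) a bit more concrete, here is a rough, self-contained sketch of the end-of-block pass combined with a backward iteration. It is not the actual FOP code: the inlines are modelled as plain StringBuilders, and ReverseCharIterator, TrailingInlinePass, collectTrailing() and finishBlock() are hypothetical names made up for this illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical backward iterator over one text node; NOT FOP's CharIterator. */
final class ReverseCharIterator {
    private final StringBuilder text;
    private int index; // index just past the char that previousChar() returns next

    ReverseCharIterator(StringBuilder text) {
        this.text = text;
        this.index = text.length();
    }

    boolean hasPrevious() {
        return index > 0;
    }

    char previousChar() {
        if (!hasPrevious()) {
            throw new NoSuchElementException();
        }
        return text.charAt(--index);
    }

    /** Removes the char last returned by previousChar(); call only after it. */
    void remove() {
        text.deleteCharAt(index);
    }
}

/** Hypothetical end-of-block pass; inlines are modelled as StringBuilders (assumption). */
final class TrailingInlinePass {

    static boolean isAllWhiteSpace(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    static boolean endsWithWhiteSpace(CharSequence s) {
        return s.length() > 0 && Character.isWhitespace(s.charAt(s.length() - 1));
    }

    /** Inlines that end in white-space and have no content after them in the block. */
    static List<StringBuilder> collectTrailing(List<StringBuilder> inlines) {
        List<StringBuilder> pending = new ArrayList<>();
        for (StringBuilder inline : inlines) {
            if (!isAllWhiteSpace(inline)) {
                pending.clear(); // earlier candidates are followed by real content
            }
            if (endsWithWhiteSpace(inline)) {
                pending.add(inline);
            }
        }
        return pending;
    }

    /** Walks each trailing inline backwards, removing chars until a non-white-space is hit. */
    static void finishBlock(List<StringBuilder> inlines) {
        for (StringBuilder inline : collectTrailing(inlines)) {
            ReverseCharIterator it = new ReverseCharIterator(inline);
            while (it.hasPrevious() && Character.isWhitespace(it.previousChar())) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<StringBuilder> block = new ArrayList<>();
        block.add(new StringBuilder("Hello "));
        block.add(new StringBuilder("world "));
        block.add(new StringBuilder(" "));
        finishBlock(block);
        System.out.println("[" + String.join("", block) + "]"); // prints "[Hello world]"
    }
}
```

Only the inlines after the last real content are touched, so the space between "Hello" and "world" survives. With white-space-collapse already applied, each backward walk removes at most one character before it stops.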

[end-of-thread, I hope ;-)]

Cheers,

Andreas
