(Sorry for the delayed reply...)

On Nov 16, 2005, at 01:15, Manuel Mall wrote:

On Wed, 16 Nov 2005 03:45 am, Jeremias Maerki wrote:
Sounds like a good plan to me. Would you go after that?

Sure thing. For now, I'll restrict it to moving handleWhitespace() into a separate class, maybe with one instance for each Flow/StaticContent or PageSequence? That instance can then be used by all blocks and markers, carrying state info down the tree. For the moment, I'm leaving the iterator structure unchanged.
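To make the "one instance carrying state down the tree" idea a bit more concrete, here is a rough sketch in Java. Everything in it is hypothetical: TextNode and SharedWhitespaceHandler are stand-ins invented for illustration, not FOP classes, and the collapsing rule is deliberately simplified (any run of XML whitespace becomes one space; linefeed-treatment and white-space-treatment are ignored).

```java
/** Hypothetical stand-in for a text node in the FO tree (not FOP's FOText). */
final class TextNode {
    String text;

    TextNode(String text) {
        this.text = text;
    }
}

/**
 * Hypothetical handler, assumed to be created once per flow and passed to
 * every block/marker. It remembers whether the last character it saw was
 * whitespace, so collapsing continues across node boundaries.
 */
final class SharedWhitespaceHandler {
    // Assumption: the start of the flow behaves as if preceded by whitespace,
    // so leading whitespace is dropped.
    private boolean lastWasSpace = true;

    /** Collapses each run of XML whitespace in the node to a single space. */
    void handle(TextNode node) {
        StringBuilder out = new StringBuilder(node.text.length());
        for (int i = 0; i < node.text.length(); i++) {
            char c = node.text.charAt(i);
            boolean isSpace = c == ' ' || c == '\t' || c == '\n' || c == '\r';
            if (!isSpace) {
                out.append(c);
            } else if (!lastWasSpace) {
                out.append(' '); // first whitespace of a run survives as one space
            }
            lastWasSpace = isSpace; // state carried over to the next node
        }
        node.text = out.toString();
    }

    public static void main(String[] args) {
        SharedWhitespaceHandler handler = new SharedWhitespaceHandler();
        TextNode[] nodes = {
            new TextNode("Hello  "), new TextNode("  world "), new TextNode("\n!")
        };
        for (TextNode node : nodes) {
            handler.handle(node);
            System.out.println("[" + node.text + "]");
        }
    }
}
```

The point of the sketch is only that the handler needs a single flag of state between calls, so each node can be processed as soon as it is complete.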

I have no problems with the suggestion to move the white space handling
from Block into its own class so other fo's that need it can make use
of it.

However, I still need to be convinced that pushing it down to inline
level is actually of benefit.

Maybe it's something aesthetic, I dunno. In theory, whitespace handling could already be started from the point where you reach the first nested start-inline event, so why wait until the first start-block? A choice... and so we are forced to recurse because previous child nodes could contain text themselves. It simply seems more 'natural' to have each FO handle its own whitespace, so the higher-level FOs only need to see the first/last characters of any child inline nested between their own FOText nodes.

BTW: what is the maximum number of characters you need in a sequence before you can be certain whether a given whitespace character should be removed/converted? The current implementation seems to indicate that number to be two or three. True enough, that's purely XML whitespace handling... But why wait to begin processing until you have maybe a few hundred characters? Since *all* whitespace is passed through by the parser, IMO the sooner you can throw excess space characters away, the better. Even more so if it's excess fo:character objects.

I am afraid we will end up with the same
problem we now have at LM level, that is text for a paragraph needs to
be analysed across fo boundaries and the current LM structures are very
much in the way of doing that.

1) Agreed that the LayoutManagers definitely may need more context than a handful of characters to make sound decisions. Looking for line-breaks, now there we really need to look across FO boundaries.

2) There is no inherent contradiction between handling whitespace at each block/inline level, and handling whitespace across FO boundaries. The latter refers more to the net result of the whole algorithm.

--and so, I wonder...

Whitespace needs to be handled across fo
boundaries as well. The current iterator structure was designed to
exactly facilitate that. It seems to be doing it well and I see no
reason to replace it.

...Hmm. Are the iterators themselves used by layout? AFAICS, that's a No. Maybe they are in the wrong package ATM? ;-)

IMO, the current iterator structure, in combination with chaining all those FOText instances together, is something that does need to be revisited (as in: definitely). Not for an alpha-release, but some comments in FOText clearly indicate that it was never the intention to keep it that way. If I get the timeline correctly, the current FOText design predates the separation of layout-logic. In the LM-tree, a BlockLM needs to be able to see all text-nodes of its descendants, but I don't immediately see a reason why a Block in the FOTree needs to.

To merge in another part of the thread:

On Nov 17, 2005, at 00:28, Manuel Mall wrote:
On Thu, 17 Nov 2005 03:40 am, Simon Pepping wrote:

linefeed-treatment is a local operation on a single character.


white-space-collapse does not cross FO boundaries because the spec
limits this to sibling character FOs.

Yes, but

&#x20;<fo:character character=" "/>

are fo character siblings in the XSL-FO sense but not fop internally.
The suggestion to move white space handling to inline will not cover
this case.

Not in itself, but deleting the character node is simpler when it is done while processing its parent than while processing an ancestor X levels up. WRT whitespace handling during refinement, inlines have more in common with markers and blocks than with text-nodes, and fo:character is more to be treated like a text-node than an inline.
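As a rough illustration of why the parent is a convenient place for this, here is a small Java sketch. ChildNode and ParentSketch are hypothetical names made up for the example (they are not FOP classes), and the rule applied is an assumption kept deliberately narrow: an fo:character holding a single space is dropped when the preceding sibling content already ends in a space.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical child node: either a run of text or a single fo:character. */
final class ChildNode {
    final boolean isCharacterObject; // true for an fo:character stand-in
    final String content;

    ChildNode(boolean isCharacterObject, String content) {
        this.isCharacterObject = isCharacterObject;
        this.content = content;
    }
}

/** Hypothetical parent FO that cleans up its own direct children. */
final class ParentSketch {
    /** Removes space-only fo:character children that directly follow a space. */
    static void dropRedundantSpaces(List<ChildNode> children) {
        boolean prevEndsWithSpace = false;
        for (Iterator<ChildNode> it = children.iterator(); it.hasNext();) {
            ChildNode child = it.next();
            if (child.isCharacterObject && child.content.equals(" ") && prevEndsWithSpace) {
                it.remove(); // the parent owns the list, so removal is a local operation
                continue;
            }
            if (!child.content.isEmpty()) {
                prevEndsWithSpace = child.content.charAt(child.content.length() - 1) == ' ';
            }
        }
    }

    public static void main(String[] args) {
        List<ChildNode> children = new ArrayList<>();
        children.add(new ChildNode(false, "a "));
        children.add(new ChildNode(true, " "));
        children.add(new ChildNode(false, "b"));
        dropRedundantSpaces(children);
        System.out.println(children.size() + " children left");
    }
}
```

An ancestor several levels up would first have to locate the node's parent before it could do the same removal, which is the extra bookkeeping the parent-level approach avoids.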


