(Sorry for the delayed reply...)
On Nov 16, 2005, at 01:15, Manuel Mall wrote:
On Wed, 16 Nov 2005 03:45 am, Jeremias Maerki wrote:
Sounds like a good plan to me. Would you go after that?
Sure thing. For now, I'll restrict it to moving handleWhitespace()
into a separate class, maybe one instance for each Flow/StaticContent
or PageSequence? That instance can then be used by all blocks and
markers, carrying state info down the tree. For the moment, leaving the iterator
structure unchanged.
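
To make the "separate class carrying state" idea a bit more concrete, here is a minimal sketch. It is not FOP's actual code: the class name WhitespaceHandlerSketch and its methods are hypothetical, and it assumes the simplest case only (white-space-collapse="true", linefeeds treated as spaces, no handling of whitespace around linefeeds or at the end of a block). The point is just that one instance can remember whether the previous text node ended in a space, so the same object can be handed from a block to its nested inlines.

```java
// Hypothetical sketch, not FOP's real API: one instance per Flow/StaticContent,
// passed to every block and inline that contains text.
final class WhitespaceHandlerSketch {

    // State carried from one text node to the next. It starts out true so
    // that leading whitespace at the start of a block is dropped.
    private boolean previousWasSpace = true;

    /** Collapses whitespace runs in one text node, continuing from the previous node. */
    String handle(CharSequence text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean isSpace = c == ' ' || c == '\t' || c == '\n' || c == '\r';
            if (isSpace) {
                // Keep only the first space of a run, even if the run
                // started in an earlier text node.
                if (!previousWasSpace) {
                    out.append(' ');
                }
                previousWasSpace = true;
            } else {
                out.append(c);
                previousWasSpace = false;
            }
        }
        return out.toString();
    }

    /** To be called when a new block starts, so leading whitespace is dropped again. */
    void reset() {
        previousWasSpace = true;
    }
}
```

With this sketch, handling "  Hello  " and then " world" on the same instance keeps a single space between the two words, no matter which FO each text node belongs to.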
I have no problems with the suggestion to move the white space handling
from Block into its own class so other fo's that need it can make use
of it.
However, I still need to be convinced that pushing it down to inline
level is actually of benefit.
Maybe it's something aesthetic, I dunno. In theory, whitespace
handling could already be started from the point where you reach the
first nested start-inline event, why wait until the first start-
block? A choice... and so we are forced to recurse because previous
child nodes could contain text themselves.
It simply seems more 'natural' to have each FO handle its own
whitespace, so the higher level FOs only need to see the first/last
characters of any child inline nested between their own FOText-nodes.
BTW: what is the maximum number of characters you need in a sequence
before you can be certain whether a given whitespace should be
removed/converted? The current implementation seems to indicate that
number to be two or three. True enough, that's purely XML whitespace
handling...
But why wait to begin processing until you have, maybe a few hundred
characters? Since *all* whitespace is passed through by the parser,
IMO the sooner you can throw excess space characters away, the
better. Even more so if it's excess fo:character objects.
I am afraid we will end up with the same problem we now have at LM
level, that is text for a paragraph needs to be analysed across fo
boundaries and the current LM structures are very much in the way of
doing that.
1) Agreed that the LayoutManagers definitely may need more context
than a handful of characters to make sound decisions. Looking for
line-breaks, now there we really need to look across FO boundaries.
2) There is no inherent contradiction between handling whitespace at
each block/inline level, and handling whitespace across FO
boundaries. The latter refers more to the net result of the whole
algorithm.
--and so, I wonder...
Whitespace needs to be handled across fo boundaries as well. The
current iterator structure was designed to facilitate exactly that.
It seems to be doing it well and I see no reason to replace it.
...Hmm. Are the iterators themselves used by layout? AFAICS, that's a
No. Maybe they are in the wrong package ATM? ;-)
IMO, the current iterator structure, in combination with chaining all
those FOText instances together, is something that does need to be
revisited (as in: definitely). Not for an alpha-release, but some
comments in FOText clearly indicate that it was never the intention
to keep it that way. If I get the timeline correctly, the current
FOText design predates the separation of layout-logic. In the LM-
tree, a BlockLM needs to be able to see all text-nodes of its
descendants, but I don't immediately see a reason why a Block in the
FOTree needs to.
To merge in another part of the thread:
On Nov 17, 2005, at 00:28, Manuel Mall wrote:
On Thu, 17 Nov 2005 03:40 am, Simon Pepping wrote:
linefeed-treatment is a local operation on a single character.
Yes
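
As an illustration of why linefeed-treatment is purely local: each linefeed can be mapped without looking at its neighbours. The sketch below uses the four values the XSL-FO property allows; the class, enum and method names are hypothetical and this is not how FOP implements it.

```java
// Hypothetical sketch: per-character linefeed-treatment, no context needed.
final class LinefeedTreatmentSketch {

    // The four values of the XSL-FO linefeed-treatment property.
    enum LinefeedTreatment { IGNORE, PRESERVE, TREAT_AS_SPACE, TREAT_AS_ZERO_WIDTH_SPACE }

    static String apply(CharSequence text, LinefeedTreatment treatment) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != '\n') {
                out.append(c);
                continue;
            }
            // The decision depends only on the linefeed itself.
            switch (treatment) {
                case IGNORE:
                    break; // drop the linefeed
                case PRESERVE:
                    out.append('\n');
                    break;
                case TREAT_AS_SPACE:
                    out.append(' ');
                    break;
                case TREAT_AS_ZERO_WIDTH_SPACE:
                    out.append('\u200B');
                    break;
            }
        }
        return out.toString();
    }
}
```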
white-space-collapse does not cross FO boundaries because the spec
limits this to sibling character FOs.
Yes, but
 <fo:character character=" ">
are fo:character siblings in the XSL-FO sense, but not internally in FOP.
The suggestion to move white space handling to inline will not cover
this case.
Not in itself, but deleting the character node is simpler when it is
done while processing its parent than while processing an ancestor X
levels up.
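
A rough sketch of what I mean by deleting at the parent level. The node classes here are made up for the example and are not FOP's FONode hierarchy; the collapse rule is also simplified to "drop a space character node that directly follows another one". The parent owns the child list, so removal is a plain iterator operation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: collapsing sibling character nodes from their parent.
final class CharacterNodeCollapseSketch {

    interface Node { }

    /** Stand-in for an fo:character object holding a single character. */
    static final class CharacterNode implements Node {
        final char ch;
        CharacterNode(char ch) { this.ch = ch; }
    }

    /** Stand-in for an inline (or block) that owns its children. */
    static final class InlineNode implements Node {
        final List<Node> children = new ArrayList<>();
    }

    /** Removes every space character node that directly follows another one. */
    static void collapseSiblings(InlineNode parent) {
        boolean previousWasSpace = false;
        Iterator<Node> it = parent.children.iterator();
        while (it.hasNext()) {
            Node node = it.next();
            boolean isSpace = node instanceof CharacterNode
                    && ((CharacterNode) node).ch == ' ';
            if (isSpace && previousWasSpace) {
                it.remove(); // the parent owns the list, so this is trivial here
            }
            previousWasSpace = isSpace;
        }
    }
}
```

Doing the same from an ancestor several levels up would mean first locating the owning parent of each node before it can be removed.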
WRT whitespace handling during refinement, inlines have more in
common with markers and blocks than with text-nodes, and fo:character
should be treated more like a text-node than like an inline.
Cheers,
Andreas