On Nov 1, 2005, at 10:04, Manuel Mall wrote:


I am sure it is doable - but is it worth it at this stage? Possibly,
once the white-space handling issues are better understood, the
whole current system needs revision? One problem with the current char
iterator is that it iterates across inline boundaries, which causes white
space to be collapsed across them; according to the clarification
from the WG, that is incorrect. IMO to implement the refinement step of the
white space handling (which currently happens in the flow.Block object)
we need an iterator which goes through all characters but indicates fo
boundaries (not including fo:characters) so we can do:
a) linefeed treatment across all characters;
b) white space collapse across each consecutive section of
implicit/explicit fo:characters, i.e. delimited by the start/end of
fo's;
c 1) white-space-treatment from the start of the fo:block to the first
non white-space character;
The iterator must also be able to either operate backwards or be able to
be reset to a particular position (last non white space character) so
we can do:
c 2)  white-space-treatment from the end of the fo:block backwards to
the first non white-space character

It must also support character deletions and character substitutions.

Does that make sense?

Very much. Precisely with that in mind, I've also been contemplating moving part of the whitespace handling to inline level. This would keep the nested inlines separate from the Block's own direct FOText descendants. At the same time, in combination with the modification I already described, it would give us an opportunity to remove fo:characters from within the nested inlines, which would become quite a pain if that removal were deferred to block level.

So the RecursiveCharIterator should only create Iterators over regular FOText or fo:characters that are direct descendants of the Block/Inline. FOText of nested FObjs should be left alone, since the whitespace will already be collapsed. IOW, it should stop being -- recursive?
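
To make the non-recursive idea a bit more concrete, here is a rough sketch. Everything in it is hypothetical: `Child`, `RunIterator` and `collapseDirectText` are stand-ins invented for illustration, not existing FOP classes, and the whitespace test is deliberately simplified (space, tab and linefeed are all folded into a single space, ignoring linefeed-treatment). The point is only that the owner creates one iterator per direct text run and never descends into nested FOs.

```java
import java.util.List;

class DirectTextSketch {

    /** Hypothetical child node: direct text, or a nested FO (text == null). */
    static final class Child {
        final StringBuilder text;
        Child(String s) { text = (s == null) ? null : new StringBuilder(s); }
    }

    /** Iterator over one text run, supporting deletion and substitution. */
    static final class RunIterator {
        private final StringBuilder run;
        private int pos = -1; // index of the character last returned by next()

        RunIterator(StringBuilder run) { this.run = run; }

        boolean hasNext() { return pos + 1 < run.length(); }

        char next() { return run.charAt(++pos); }

        /** Substitutes the character last returned by next(). */
        void replace(char c) { run.setCharAt(pos, c); }

        /** Deletes the character last returned by next(). */
        void remove() { run.deleteCharAt(pos--); }
    }

    /** Collapses whitespace in direct text children only; nested FOs are not entered. */
    static void collapseDirectText(List<Child> children) {
        for (Child child : children) {
            if (child.text == null) {
                continue; // nested FO: assumed to have handled its own text
            }
            RunIterator it = new RunIterator(child.text);
            boolean prevWasSpace = false; // reset per run, so nothing collapses across an fo boundary
            while (it.hasNext()) {
                char c = it.next();
                boolean isSpace = (c == ' ' || c == '\t' || c == '\n');
                if (isSpace && prevWasSpace) {
                    it.remove();       // drop the repeated whitespace character
                } else if (isSpace) {
                    it.replace(' ');   // normalize the first one to a plain space
                }
                prevWasSpace = isSpace;
            }
        }
    }
}
```

Because the collapse state is reset for every run, whitespace on either side of a nested FO is left alone, which is the behaviour the WG clarification seems to ask for.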

Currently, whitespace handling is triggered as soon as a Block encounters a child node that neither is FOText nor generates inline areas. In principle this seems OK; the only change I'd propose is that inlines do their own whitespace handling, so that *if* whitespace needs to be collapsed across fo boundaries (maybe there are cases?), the block level only needs to look at the first and last characters of an inline's text.
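
As a sketch of what "only look at the first and last characters" could mean, assuming each side has already collapsed its own whitespace: the helper below is purely hypothetical (the name `joinAcrossBoundary` and the use of plain strings are mine, not anything in FOP), and it only covers the case where collapsing across the boundary turns out to be wanted at all.

```java
class BoundarySketch {

    /**
     * Hypothetical helper: joins two already-collapsed text runs that meet
     * at an fo boundary, dropping one space if both sides contribute one.
     * Only the last character of 'left' and the first of 'right' are examined.
     */
    static String joinAcrossBoundary(String left, String right) {
        boolean leftEndsWithSpace =
                !left.isEmpty() && left.charAt(left.length() - 1) == ' ';
        boolean rightStartsWithSpace =
                !right.isEmpty() && right.charAt(0) == ' ';
        if (leftEndsWithSpace && rightStartsWithSpace) {
            return left + right.substring(1); // keep the left-hand space only
        }
        return left + right;
    }
}
```

The block level never has to rescan the inline's text here; the cost is constant per boundary, whatever the length of the runs.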


Cheers,

Andreas
