Re: svn commit: r360083 - in /xmlgraphics/fop/trunk: ./ src/java/org/apache/fop/fo/ src/java/org/apache/fop/fo/flow/ test/layoutengine/standard-testcases/

Andreas L Delmelle Sat, 31 Dec 2005 08:03:10 -0800

On Dec 31, 2005, at 16:05, Manuel Mall wrote:

[Me:]
Well, it's definitely not impossible, but I'm wondering a bit about
Cost vs. Benefit. Currently, when the trailing spaces for any inline
are treated --in Inline.endOfNode()-- one has no way of knowing
whether any text will still follow --possible subsequent nested
inlines, text or characters will not be available yet.


This indicates to me that your redesigned algorithm has the same flaws
as we currently encounter with the inline layout manager structure. Any
problems which require looking across FO (= LM) boundaries suddenly
become hard. BTW, the original block level whitespace handling
refinement didn't have that problem as it had the whole block content
available to it. So I still think we have regressed here.

Maybe so... but I'm looking at this as taking a step backwards like one does before taking a leap.

Besides that, it is not a *flaw* per se. Strictly speaking, white-space collapsing/removal applies to sibling character nodes in the source document. The fact that leading white-space in a paragraph can be removed during refinement without any real extra effort is a convenience, a bonus that follows from the preceding text-nodes or inline-nodes already being processed (= the state indicated by the 'inWhiteSpace' and 'afterLinefeed' variables can be carried over). There is no need for look-behind here (the previous algorithm didn't do so either).
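To make the "state can be carried over" point concrete, here is a minimal sketch, not FOP's actual code. The class name WhiteSpaceCollapser and the method refine() are hypothetical; only the flag name 'inWhiteSpace' is borrowed from the discussion above, and the 'afterLinefeed' flag is left out for brevity. It assumes collapsing is enabled and that linefeeds are treated as spaces.

```java
/** Hypothetical sketch of white-space collapsing with state carried across text nodes. */
final class WhiteSpaceCollapser {
    // Starts as true so that leading white-space in the block is dropped
    // without any look-behind. Assumption: a new block behaves as if it
    // were already "in white-space".
    private boolean inWhiteSpace = true;

    /** Refines one text node; the flag survives until the next call. */
    String refine(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean ws = c == ' ' || c == '\t' || c == '\n' || c == '\r';
            if (ws) {
                // Emit a single space for the first white-space character of a run only.
                if (!inWhiteSpace) {
                    out.append(' ');
                }
                inWhiteSpace = true;
            } else {
                out.append(c);
                inWhiteSpace = false;
            }
        }
        return out.toString();
    }
}
```

Calling refine() on successive text nodes of the same block with one shared instance removes white-space at the start of the second node when the first one ended in white-space. Note that trailing white-space of the last node is still left in place, which is exactly the case being debated.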

The possible problem I saw with the block-level white-space handling was that all white-space characters would continue to take up memory until the first nested block or, in the worst case, until the end-of-block. In case of large blocks with lots of indents due to pretty-printing, the current approach makes these spaces disappear much sooner (= more memory-efficient).

When I talk about cost/benefit, I refer to the fact that we already get two passes over the same character sequences:

- once when building the FOTree
- another when performing layout

In order to implement this trailing white-space removal for nested trailing inlines during refinement --I can't stress it enough: a *purely* aesthetic matter; the conceptual/logical necessity still escapes me...-- we would have to add a third pass.

In theory, we could keep a reference alive to the last FOText of the
previous inline, so that when it appears at the end of the block, we
could strip its trailing white-space too.


Yes, that is what you get when doing this fo centric. You have to keep
context / state / global variables to deal with "cross border" issues.

Carrying over the context is no problem when it comes to previous nodes, but you simply don't have the luxury of look-ahead in the FOTree --that is, look-ahead is limited to the nodes already available at that point. One way to deal with it is to accumulate all nodes, and only process them at the end-of-block/nested blocks. This has the above-mentioned drawback --space characters taking up resources far longer than strictly necessary.
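For comparison, the cheaper alternative quoted earlier (keeping a reference alive to the last text node only, rather than accumulating all nodes) could be sketched roughly as follows. All names here (TrailingSpaceTracker, TextNode, text(), endOfBlock()) are hypothetical stand-ins for illustration, not FOP's FOText or its API.

```java
/** Hypothetical sketch: remember only the most recent text node of a block. */
final class TrailingSpaceTracker {

    /** Minimal stand-in for a text node in the tree; not FOP's FOText. */
    static final class TextNode {
        final StringBuilder chars;

        TextNode(String s) {
            chars = new StringBuilder(s);
        }

        @Override
        public String toString() {
            return chars.toString();
        }
    }

    // The only extra state: a reference to the last text node seen so far.
    private TextNode lastText;

    /** Called for each text node as the tree is built, at any nesting depth. */
    TextNode text(String s) {
        TextNode node = new TextNode(s);
        lastText = node;
        return node;
    }

    /** Called at end-of-block: strip trailing white-space from the last text node. */
    void endOfBlock() {
        if (lastText == null) {
            return;
        }
        StringBuilder sb = lastText.chars;
        int end = sb.length();
        while (end > 0 && Character.isWhitespace(sb.charAt(end - 1))) {
            end--;
        }
        sb.setLength(end);
        lastText = null; // release the reference once the block is closed
    }
}
```

The sketch deliberately ignores one complication: if the last text node consists entirely of white-space, the node before it may also end in white-space, and a single reference is then not enough.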

OTOH, look-ahead in the FOTree isn't really required for anything (apart from maybe this particular scenario). The layout algorithm *needs* to be able to move/look in both directions anyway, so AFAICT, it shouldn't be too much effort to handle trailing spaces for trailing nested inlines there... If that is such a difficult matter, then one should doubt the layout-algorithm, if anything, instead of trying to work around the lack of look-ahead in the FOTree.

[Me:]
Apart from the aesthetic argument (nice symmetry): why exactly?
Again, IMO, if the right element-sequences are generated for these
white-spaces, they should be suppressed at the end of the paragraph
anyway (forced EOL).


It's not a matter of generating the correct Knuth element sequences
because the algorithm doesn't care about what is at the beginning or
end of a paragraph. Giving the correct (= whitespace handled) paragraph
to the Knuth algorithm is a precondition. Again: line breaking deals
with adding breaks at optimal allowable points within the text; it
doesn't care what's at the start and end.

Et voilà, that seems to be where the real *flaw* is located, if you ask me. It should care about glues at the beginning of a line --which it seems to handle perfectly ATM-- regardless of whether it's the first line in a paragraph or not. In the same way, it should care about glues at the end of a line, regardless of whether it is the last line in a paragraph or not.
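What I have in mind is roughly the following: a minimal sketch under my own assumptions, not the layout engine's real element classes. The names LineGlueTrimmer, Element, box(), glue(), trim() and render() are all hypothetical. The point is only that one and the same rule discards glue at both ends of a line, with no special case for the first or last line of a paragraph.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: discard glue at both ends of any line, first, middle or last. */
final class LineGlueTrimmer {

    /** Minimal element model: a box carries text, a glue stands for white-space. */
    static final class Element {
        final boolean glue;
        final String text;

        private Element(boolean glue, String text) {
            this.glue = glue;
            this.text = text;
        }
    }

    static Element box(String text) {
        return new Element(false, text);
    }

    static Element glue() {
        return new Element(true, " ");
    }

    /** Returns the elements of one line without leading or trailing glue. */
    static List<Element> trim(List<Element> line) {
        int start = 0;
        int end = line.size();
        while (start < end && line.get(start).glue) {
            start++;
        }
        while (end > start && line.get(end - 1).glue) {
            end--;
        }
        return new ArrayList<>(line.subList(start, end));
    }

    /** Renders a trimmed line as plain text, one space per remaining glue. */
    static String render(List<Element> line) {
        StringBuilder sb = new StringBuilder();
        for (Element e : trim(line)) {
            sb.append(e.text);
        }
        return sb.toString();
    }
}
```

With such a rule, a last line consisting of glue, box, glue, box, glue, glue is rendered exactly like any other line: the inner glue survives, the outer ones do not.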

Besides that, I get the impression you're somewhat contradicting yourself here:

- in the comment on the failing testcase you noted that 'These tests fail because the Knuth element sequences for consecutive whitespace are not correct.'
- and now you're saying that it's not a matter of generating the correct element sequences

Can you clarify? Doesn't this indicate that there is a difference in processing between the last line in a paragraph and all other lines? That seems inconsistent. A line is a line is a line, no matter at what position in the paragraph we find ourselves.



Cheers,

Andreas
