On Oct 18, 2007, at 19:23, Vincent Hennebert wrote:

<snip />
OTOH, the above is semantically equivalent to (I think we had already
established that there should not be a double page-break here)

<fo:block break-before="page">
  <fo:block>
    <fo:block>

If the LMs were guaranteed to receive the 'normalized' form, the
break-condition could be tested for internally by the outer LM itself. No need to look forward or back... The first descendants wouldn't even need
to check for breaks anymore.

I think I see your point. Basically you’re proposing a push method (an LM notifies its parent LM that it has a break-before) while mine is a pull
method (an LM asks its children LMs if they have break-before).

Yep, although it would not be the LM but rather the FO that pushes the break-before upwards to its parent if it is also the first child. The LMs would largely continue to work as they do now, except that under a certain set of conditions, an LM no longer needs to check the outside: it only takes into account the forced break on its own FO. If there is none, then there is no need to recursively check for first descendants having forced breaks.
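To make the 'push' a bit more concrete, here is a minimal sketch of what I have in mind at tree-building time. PushNode, addChild() and pushBreakUp() are hypothetical names made up for illustration, not FOP's actual FONode API, and the sketch assumes the break property is already known when the node is added to its parent.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, stripped-down stand-in for an FO node; not FOP's FONode. */
class PushNode {
    final String id;
    final List<PushNode> children = new ArrayList<>();
    PushNode parent;
    boolean breakBefore; // stands in for break-before="page"

    PushNode(String id, boolean breakBefore) {
        this.id = id;
        this.breakBefore = breakBefore;
    }

    /** Called while the tree is being built; returns the child to allow chaining. */
    PushNode addChild(PushNode child) {
        child.parent = this;
        children.add(child);
        child.pushBreakUp();
        return child;
    }

    /** Moves a forced break upwards for as long as the holder is a first child. */
    private void pushBreakUp() {
        PushNode node = this;
        while (node.breakBefore
                && node.parent != null
                && node.parent.children.get(0) == node) {
            node.breakBefore = false;
            node.parent.breakBefore = true;
            node = node.parent;
        }
    }
}
```

With three nested blocks where only the innermost one has the break, the flag ends up on the outermost block and nowhere else, which is the normalized form shown earlier. A real implementation would presumably stop climbing at the fo:flow rather than at the root of the tree.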

Currently (sorry if it becomes boring to stress this) the construction of the layout-tree starts only when the end-of-page-sequence event occurs. I still see room for changing this in the future, and so I need to consider the effects on the layout-algorithm as well: the algorithm will, for instance, no longer be able to rely on *all* childLMs being available the first time it enters the loop... The last childLM in an iteration might turn out to be not-the-last-one-after-all. For many following FONodes, the LMs do not exist yet at that point. Not in my head, at least. ;-)

You’re
more at the FO tree building stage, I’m more at the layout stage. In
terms of efficiency I think both methods are equivalent as the same
number of method calls will be performed either way.

Right, but OTOH... it's more a matter of /when/ (in the process) that happens.

The push method might be slightly more complicated to implement in
special cases like tables: when an fo:cell notifies its parent
fo:table-body that it has a break-before, the table-body must figure out
if the cell lies in the first row or not.

Almost everything is /slightly/ more complicated in case of fo:tables, especially those without explicit fo:table-rows or -columns. ;-)

Anyway, I remember that when I implemented implicit column-numbers, I also gave TableBody an instance member to check whether we are adding cells in the first row or not, so this particular case would be easily addressed. (Checking... yep, it's still there.)

Come to think of tables, I'd consider 'propagation' in terms of pushing a forced break on a cell to the first cell in the row. In the table-layout code, at the point where we have a reference to the row or the first cell in a row, we would immediately know whether there is a forced break on a first descendant in any of the following sibling cells without having to request the corresponding childLMs and trigger a tree-traversal of who-knows-how-many levels.
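As a rough sketch of that row-level propagation (SketchCell, SketchRow and their members are hypothetical names for illustration, not the existing table classes), the forced break could be copied to the first cell at the moment a cell is added to the row:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a table-cell; not FOP's TableCell. */
class SketchCell {
    final int columnNumber;
    /** True if this cell, or a first descendant of it, carries a forced break-before. */
    boolean forcedBreakBefore;

    SketchCell(int columnNumber, boolean forcedBreakBefore) {
        this.columnNumber = columnNumber;
        this.forcedBreakBefore = forcedBreakBefore;
    }
}

/** Hypothetical stand-in for a (possibly implicit) table-row. */
class SketchRow {
    private final List<SketchCell> cells = new ArrayList<>();

    /** Adds a cell and copies a forced break on it to the first cell of the row. */
    void addCell(SketchCell cell) {
        cells.add(cell);
        if (cell.forcedBreakBefore) {
            cells.get(0).forcedBreakBefore = true;
        }
    }

    /** Layout-side check: a single lookup instead of visiting every sibling cell. */
    boolean startsWithForcedBreak() {
        return !cells.isEmpty() && cells.get(0).forcedBreakBefore;
    }
}
```

The flag on the originating cell is left untouched here; whether it should be cleared once it has been copied is a detail I have not thought through.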

Keeping in mind the above mentioned idea of triggering layout sooner, if we can guarantee that the layoutengine always receives complete rows, then the table-layout job should become a bit simpler in the general use-case, while still not adding much complexity in trickier, more exotic cases, like:
//table-cell/block[position() > [EMAIL PROTECTED]'page']

especially where the cell's column-number corresponds to the highest column-number.

Triggering layout sooner is the only way we are ever going to get FOP to accept arbitrarily large tables without consuming massive amounts of heap. A 'simple' grid of 5 x 500 cells generates more than 5000 FONodes (2500 table-cells, which must have at least one block each) that stay in memory until the page-sequence is completely finished. I wonder how many break-possibilities that generates... :/


A matter of taste, probably, but I think I’d prefer the pull method: the LM performs requests to the appropriate children LMs exactly when and if
needed.

The only thing an LM should initially pull/request from its children, AFAIU, is a list of elements, given a certain LayoutContext. When composing its own element list, an LM should ideally be able to rely on the lists it receives from its children. Then add/delete/update elements and (un)wrap, depending on context that is unknown or irrelevant to the child.

That may simplify code as well (and improve its readability) as
some form of pull method is necessary anyway (the
mustKeepWithPrevious/WithNext/Together methods).

Keeps are a different story indeed. Big difference is that keeps have strengths, and breaks do not.

Consider:

<fo:block id="b1">
  ...
  <fo:block id="b2">
    <fo:block id="b3" keep-with-previous.within-page="...">
      <fo:block id="b4">
        <fo:block id="b5" break-before="page">

This may be a matter of interpretation: you cannot specify a 'strength' for a break. It is either there or not. I take this to mean that a forced break overrules any keep.

Main advantage to the layoutengine would be that forced breaks are known as early as possible: the break is either there, on the FO, when the LM is initialized --propagated upwards from a first child, maybe seven or eight levels down--, or it is not. The above can be normalized at parse time, with only a marginal cost, so that the break is propagated upwards to block b2, and the keep is suppressed before any LM is even created.
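A minimal sketch of such a normalization, written here as a separate pass over a finished subtree for clarity. NormBlock and BreakNormalizer are hypothetical names, only the two properties under discussion are modelled, and the rule that a forced break simply cancels a keep-with-previous is my interpretation from above, not something the sketch proves.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical block node carrying only the two properties discussed here. */
class NormBlock {
    final String id;
    final List<NormBlock> children = new ArrayList<>();
    boolean breakBefore;      // break-before="page"
    boolean keepWithPrevious; // keep-with-previous.within-page set to some strength

    NormBlock(String id) {
        this.id = id;
    }

    /** Appends a child and returns this node, so calls can be chained. */
    NormBlock add(NormBlock child) {
        children.add(child);
        return this;
    }
}

final class BreakNormalizer {
    private BreakNormalizer() { }

    /**
     * Post-order pass: hoists a forced break from a first child to its parent,
     * and drops keep-with-previous on every node the break passes through or
     * ends up on (assumption: a forced break overrules any keep).
     */
    static void normalize(NormBlock node) {
        for (NormBlock child : node.children) {
            normalize(child);
        }
        if (!node.children.isEmpty()) {
            NormBlock first = node.children.get(0);
            if (first.breakBefore) {
                first.breakBefore = false;
                first.keepWithPrevious = false;
                node.breakBefore = true;
            }
        }
        if (node.breakBefore) {
            node.keepWithPrevious = false;
        }
    }
}
```

Applied to the b1..b5 example, the break on b5 travels up through b4 and b3 and stops at b2, because b2 is not the first child of b1; the keep on b3 is gone by the time any LM would be created.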

I believe you already mentioned this idea of normalizing/simplifying the FO tree in the past. Note that it may exist in parallel as it addresses
a different general issue. One concern I’d have is to make sure that
a simplification leads to a semantically equivalent result.

That is precisely the purpose of normalization: to remove ambiguities at a point where it is still relatively simple. Ambiguities that would otherwise cause a significant amount of checks or tree- or list-traversals later on to get every possible scenario right. (FWIW: XEP also normalizes the input FO, but there it happens by means of an XSLT; IIRC, they normalize tables to always have columns and rows, for example; implicit column-numbers can also quite easily be computed/assigned as part of an XSL Transform)

Given the complexity of the spec that might be difficult to establish. Not sure also if the overhead is compensated by the gain in the further processes
(layout, area tree generation). But that’s a different topic.

The key advantage in the longer term is that the start of those further processes can be triggered sooner, without adding too much complexity to the related source code.

Agreed with the concerns, but I'm wondering if these portions of code, instead of extracting them into a separate class, could be centralized
in, say, BlockStackingLM and InlineStackingLM...?

I thought of that, but a separate class looked cleaner to me for some
reasons:
- the LMs classes are already overcrowded with many different concerns

True.

- the code would be about the same for Block- and InlineStackingLM
- we could factorize it into a common super-class


AbstractStackingLM...?

I kind of like the idea. For the really shared portions, AbstractStackingLM could then implement a set of static methods.

but both those classes
  have subclasses to which breaks don’t apply (Flow-, StaticContentLM,
  for example).

I wouldn't really see this as a problem. The related methods will never be called, unless there is a flaw in our logic[*]. To stress the fact that they serve no purpose there, we could add overrides that always return false.

[*] (They won't be called, precisely because breaks don't apply?)
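Roughly what I mean, as a sketch only: the class names below are placeholders prefixed with 'Sketch', mustBreakBefore() is a made-up method name modelled on the existing mustKeepWithPrevious() family, and the constructor flag stands in for whatever the LM would read from its FO.

```java
/** Hypothetical common super-class for the block- and inline-stacking LMs. */
abstract class SketchStackingLM {
    private final boolean breakBeforeOnFO;

    SketchStackingLM(boolean breakBeforeOnFO) {
        this.breakBeforeOnFO = breakBeforeOnFO;
    }

    /** Really shared logic can live in a static method. */
    static boolean hasForcedBreak(boolean breakBeforeOnFO) {
        return breakBeforeOnFO;
    }

    /** Default behaviour: only the forced break on the LM's own FO counts. */
    boolean mustBreakBefore() {
        return hasForcedBreak(breakBeforeOnFO);
    }
}

/** A subclass to which breaks apply simply inherits the default. */
class SketchBlockLM extends SketchStackingLM {
    SketchBlockLM(boolean breakBeforeOnFO) {
        super(breakBeforeOnFO);
    }
}

/** A subclass to which breaks do not apply states so explicitly. */
class SketchFlowLM extends SketchStackingLM {
    SketchFlowLM(boolean breakBeforeOnFO) {
        super(breakBeforeOnFO);
    }

    @Override
    boolean mustBreakBefore() {
        return false;
    }
}
```

The override costs a few lines per subclass, but it documents in the code itself that the method serves no purpose there.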

OTOH keeps apply to AbstractGraphicsLM which doesn’t
  inherit any of those classes.

That's a special case, since in principle a graphic does not itself consist of more layout-objects that need to be stacked. To the layoutengine, a graphic is simply a monolithic box. Graphics are inline by definition nonetheless, so it could be InlineStackingLM with the same reservations as for FlowLM and StaticContentLM, but for other methods (the actual 'inline-stacking' can be considered to be delegated to the producer of the graphic, here).



Cheers

Andreas
