On 04.05.2005 21:09:00 Andreas L. Delmelle wrote:
> > -----Original Message-----
> > From: Jeremias Maerki [mailto:[EMAIL PROTECTED]
> >
> > On 27.04.2005 18:28:55 Andreas L. Delmelle wrote:
> > <snip />
> > > BTW: Does this problem pose itself only if a single cell
> > > or row spans more than two pages, or also when an entire
> > > row-group does so?
> >
> > Possibly, yes. A row-group in this context is formed because at least
> > one cell has row-spanning. Therefore, you have the possibility that a
> > cell spans more than 2 pages.
> >
> 
> I was just wondering what that would mean for the following, admittedly
> somewhat unrealistic example, more like 'abstract layout art'... Well, a
> sketch anyway :-)
> 
> <fo:table>
>   ...
>   <fo:table-body>
>     <fo:table-row>
>       <fo:table-cell number-rows-spanned="2" column-number="1">
>         <!-- block content -->
>       </fo:table-cell>
>       <fo:table-cell column-number="2">
>         <!-- block content -->
>       </fo:table-cell>
>       <fo:table-cell column-number="3">
>         <!-- block content -->
>       </fo:table-cell>
>     </fo:table-row>
>     <fo:table-row>
>       <fo:table-cell number-rows-spanned="2" column-number="2">
>         <!-- block content -->
>       </fo:table-cell>
>       <fo:table-cell column-number="3">
>         <!-- block content -->
>       </fo:table-cell>
>     </fo:table-row>
>     <fo:table-row>
>       <fo:table-cell column-number="1">
>         <!-- block content -->
>       </fo:table-cell>
>       <fo:table-cell number-rows-spanned="2" column-number="3">
>         <!-- block content -->
>       </fo:table-cell>
>     </fo:table-row>
>     <fo:table-row>
>       <fo:table-cell number-rows-spanned="2" column-number="1">
>         <!-- block content -->
>       </fo:table-cell>
>       <fo:table-cell column-number="2">
>         <!-- block content -->
>       </fo:table-cell>
>     </fo:table-row>
>     <!-- repeat last three of above rows, and add
>          a last row filling the cells in the
>          unoccupied columns -->
>   </fo:table-body>
> </fo:table>
> 
> IIUC, this borderline case would result in the entire table basically being
> one row-group? (In every row there is always one cell spanning from the
> previous row as well as another cell spanning to the next.)

Yes. Funny example to run through my row group detector to see if it
works. :-)
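
For illustration only, here is a minimal sketch of what such a row group
detector could look like. It is not FOP's actual code: the class name,
the method name and the int[][] input format (one array per row, holding
number-rows-spanned for each cell that starts in that row) are all
assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not taken from FOP.
class RowGroupSketch {

    /**
     * spans[r] lists number-rows-spanned for every cell that STARTS in row r.
     * Returns {firstRow, lastRow} pairs (0-based, inclusive), one per row group.
     */
    public static List<int[]> detectRowGroups(int[][] spans) {
        List<int[]> groups = new ArrayList<>();
        int groupStart = 0;
        int lastCoveredRow = 0; // last row reached by any cell seen so far
        for (int row = 0; row < spans.length; row++) {
            for (int span : spans[row]) {
                lastCoveredRow = Math.max(lastCoveredRow, row + span - 1);
            }
            boolean lastRowOfTable = row == spans.length - 1;
            // A group closes once no cell reaches past the current row.
            if (row >= lastCoveredRow || lastRowOfTable) {
                groups.add(new int[] {groupStart, row});
                groupStart = row + 1;
                lastCoveredRow = row + 1;
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Andreas's sketch: four rows, the last three repeated, plus a closing row.
        int[][] sketch = {
            {2, 1, 1}, {2, 1}, {1, 2}, {2, 1},
            {2, 1}, {1, 2}, {2, 1},
            {1, 1}
        };
        for (int[] g : detectRowGroups(sketch)) {
            System.out.println("rows " + g[0] + ".." + g[1]);
        }
    }
}
```

Fed with the spans from the quoted table, this sketch reports a single
group covering rows 0 to 7, which matches the "entire table is one
row-group" reading above.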

> The cells themselves won't necessarily span multiple pages, neither will the
> rows... but it does make a nice demonstration of why the idea of a
> page-break *in* a row or cell should be no surprise. If one wouldn't allow
> this, and a construct like the above would span multiple pages, anything
> other than shrink-to-fit would be impossible, since there is no possible
> page-break that precisely coincides with the table-grid?

The question isn't whether the page break has to match the table grid.
If you don't use borders and add 4 lines of text in the spanned cells
and 2 lines of text in the non-spanned cells, all with the same font
and size, you get page breaks inside the cells. But in this special
case the table could be broken over more than 2 pages without the need
for recalculating the page breaks. The algorithm will not produce any
penalties.

In most cases people will create tables in which each line (of text
within the cells) is carefully aligned, which allows the no-discard
approach we have now to continue to work. But the problem is that it is
easy to create an example that produces a raw penalty (i.e. not the
effPenalty value in the RowBorder/RowBorder2 examples). As soon as that
happens, a page break is likely to invalidate the subsequent break
possibilities in the combined element list.

> > > Just asking because, in the latter case, it may turn
> > > out to be less unlikely/more dangerous than it seems
> > > at first glance, albeit only a tiny bit.
> >
> > Well, less likely simply means that it can happen and sooner or later
> > we'll have to deal with it.
> 
> Yep, sure thing. If we don't take care of it now, you can bet your life it
> will turn up in a post on fop-user in less than a month after the eventual
> release...
> 
> (Apologies in advance if the following are silly questions, since I'm not
> yet fully 'at home' in the Knuth breaking code)
> I'm wondering if the elements you're talking about discarding are Really
> Wrong, or whether they are simply A Bit Off --so it would be
> possible/feasible to introduce a 'correction factor' (most likely not a
> primitive datatype, but a tiny nested compound) that increases/is adjusted
> with each page-break occurring during layout of a given sequence? Recycling
> or resetting instead of recreating?

Hmm, I don't see how this could be done. The algorithm may create a
similar combined element list after the break, and the similarity is
likely to be rather big. But the problem is again the cases that jumble
the whole alignment of boxes and therefore create a totally different
combined list.

> (By 'discard all elements starting on
> the second page', I take it you don't mean to discard the elements on the
> first page? Because they are Not Wrong, So Far? And when we reached the
> fourth page, only the elements on page three would be discarded, or
> everything after page one?)

Uhm, let me explain differently, by using a list of events:
- The first combined list is created.
- Page breaking is done; the list up to the break point is used for the
  first page.
- Now the rest of the list can be used on the second page, but if the
  content doesn't fit you are still working with the break possibilities
  that were calculated from the viewpoint of the first page, i.e. the
  break possibility you'd like to use for break 2-3 was actually created
  for a 1-2 break case. As I tried to outline, the problem is the
  penalties, which translate more or less into additional whitespace in
  certain places.

Have another look at the PDF that graphically displays the different
break points. That's what made me realize that the combined list is
only valid for the first break:

http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf

Look closely at the second part of each step. The alignments of the
boxes are sometimes completely different from the original.

> Based on that: Would it be worth the overhead to perform a sort of
> correction-pass over the first elements on a given Nth page before the first
> step --or after the last step for the previous page?

Creative, but too hazy for me to grasp. I'm already very much at my limit
just coming up with an improved algorithm for "getNextStep" when borders
and headers/footers have to be taken into account. You should see the
pile of discarded paper on my desk. :-) getNextStep looks easy
graphically, but is quite difficult to translate into an (efficient)
algorithm.

I hope our ingenious Luca might have another hot idea to deal with
this. So far I can only see the brute-force method (discarding element
lists after the break and somehow backtracking while cleverly optimizing
so as not to create too many discarded objects - still a non-trivial
thing overall).
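
To make the brute-force idea a bit more tangible, here is a toy sketch.
It is not FOP code and makes loud simplifying assumptions: the "element
list" is just an array of step heights, every step is 10 units high,
and the first step on a page costs 5 units extra (a made-up stand-in
for whatever makes a list depend on where the page starts). All names
are hypothetical, and the backtracking and optimizing parts are left
out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; every name and number here is an assumption.
class RebuildAfterBreakSketch {
    static final int STEP = 10;       // toy height of one step
    static final int FIRST_EXTRA = 5; // toy extra height of the first step on a page

    /** Builds a fresh toy list for the steps not yet placed, seen from a new page start. */
    static int[] buildList(int remainingSteps) {
        int[] heights = new int[remainingSteps];
        for (int i = 0; i < remainingSteps; i++) {
            heights[i] = STEP + (i == 0 ? FIRST_EXTRA : 0);
        }
        return heights;
    }

    /** Counts how many steps starting at 'from' fit on one page (at least one). */
    static int fit(int[] heights, int from, int pageHeight) {
        int used = 0;
        int count = 0;
        while (from + count < heights.length
                && used + heights[from + count] <= pageHeight) {
            used += heights[from + count];
            count++;
        }
        return Math.max(count, 1); // force progress on an oversized step
    }

    /** Brute force: discard the rest of the list after every break and rebuild it. */
    public static List<Integer> breaksWithRebuild(int totalSteps, int pageHeight) {
        List<Integer> breaks = new ArrayList<>();
        int placed = 0;
        while (placed < totalSteps) {
            int[] fresh = buildList(totalSteps - placed);
            placed += fit(fresh, 0, pageHeight);
            breaks.add(placed);
        }
        return breaks;
    }

    /** Reuse: keep working with the list that was built for the first page. */
    public static List<Integer> breaksWithStaleList(int totalSteps, int pageHeight) {
        List<Integer> breaks = new ArrayList<>();
        int[] stale = buildList(totalSteps);
        int placed = 0;
        while (placed < totalSteps) {
            placed += fit(stale, placed, pageHeight);
            breaks.add(placed);
        }
        return breaks;
    }

    public static void main(String[] args) {
        System.out.println("rebuild: " + breaksWithRebuild(7, 40));
        System.out.println("stale:   " + breaksWithStaleList(7, 40));
    }
}
```

With 7 steps and a page height of 40, the rebuilding variant breaks
after steps 3, 6 and 7, while the stale variant breaks after steps 3
and 7. In this toy model the stale list puts four steps on the second
page, although seen from that page's start they would need 45 units.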

Essentially, this is my one big issue that leaves me wondering whether
the whole Knuth approach is going to work or not.

Jeremias Maerki
