On Aug 31, 2006, at 17:04, Jeremias Maerki wrote:

Hi Jeremias,

I'm investigating what would be necessary to implement hyphenation-keep.
After some thought, I think this is one of those very mean properties
that fire back from page-breaking back into line-breaking. IOW, when you
detect a page/column break at a line which is hyphenated you'll
basically have to track back and redo the line breaking, disabling that
particular hyphenation possibility. You then have to redo the page
breaking possibly having to backtrack again if another hyphenated line
is again at the end of a column/page. Doesn't sound like a small change.

As it happens, I've been looking in the same direction, although not specifically at the hyphenation-keep property.

The cheap way, of course, is to add penalty values to discourage page
breaks between hyphenated lines (when hyphenation-keep is activated) but that could lead to ugly layout. It's certainly better to disable certain hyphenation points based on feedback from page breaking but it obviously
means starting to backtrack into line breaking. Maybe the "changing
available IPD" problem also plays into this. As we've seen, it may be
necessary to redo certain line breaks based on events in page breaking.
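To make the feedback loop concrete, here is a toy sketch in Java. It is not FOP code: the class HyphenationKeepSketch, the greedy breaker, the "|" marker for a hyphenation point, and the fixed number of lines per page are all assumptions made for illustration. It only shows the shape of the backtracking: break lines, find a page-final line that ends in a hyphen, disable that hyphenation point, and redo the line breaking.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only; none of these names exist in FOP. */
final class HyphenationKeepSketch {
    /** One broken line; hyphenAt is the index of the word split at the line end, or -1. */
    static final class Line {
        final String text;
        final int hyphenAt;
        Line(String text, int hyphenAt) {
            this.text = text;
            this.hyphenAt = hyphenAt;
        }
    }

    /** Greedy breaker. A word "foo|bar" may be split at '|' unless its index is disabled. */
    static List<Line> breakLines(String[] words, int width, Set<Integer> disabled) {
        List<Line> lines = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            String whole = words[i].replace("|", "");
            String sep = cur.length() == 0 ? "" : " ";
            if (cur.length() + sep.length() + whole.length() <= width) {
                cur.append(sep).append(whole);
                continue;
            }
            int cut = words[i].indexOf('|');
            String head = cut < 0 ? null : words[i].substring(0, cut) + "-";
            if (head != null && !disabled.contains(i)
                    && cur.length() + sep.length() + head.length() <= width) {
                // Hyphenate: the head closes this line, the tail opens the next one.
                lines.add(new Line(cur + sep + head, i));
                cur = new StringBuilder(words[i].substring(cut + 1));
            } else {
                // No usable hyphenation point: move the whole word to a new line.
                if (cur.length() > 0) {
                    lines.add(new Line(cur.toString(), -1));
                }
                cur = new StringBuilder(whole);
            }
        }
        if (cur.length() > 0) {
            lines.add(new Line(cur.toString(), -1));
        }
        return lines;
    }

    /** Redo line breaking until no page-final line ends in a hyphen. */
    static List<Line> layout(String[] words, int width, int linesPerPage) {
        if (linesPerPage < 1) {
            throw new IllegalArgumentException("linesPerPage must be >= 1");
        }
        Set<Integer> disabled = new HashSet<>();
        while (true) {
            List<Line> lines = breakLines(words, width, disabled);
            int offender = -1;
            for (int last = linesPerPage - 1; last < lines.size(); last += linesPerPage) {
                if (lines.get(last).hyphenAt >= 0) {
                    offender = lines.get(last).hyphenAt;
                    break;
                }
            }
            if (offender < 0) {
                return lines;
            }
            // Each pass disables one previously enabled point, so the loop terminates.
            disabled.add(offender);
        }
    }
}
```

Even in this toy version, every disabled point forces a full re-break of the text, which is the cost being discussed here.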

I've been doing some more browsing in the code and re-read your Wiki page, and I'm getting more convinced that line-breaking should not be made literally 'restartable' to deal with varying ipd between pages. This does NOT mean that we don't need restartable line-breaking at all, only that I think it's not the solution to that particular problem.

In fact, in some cases --if the ipd-change occurs early in the page-sequence-- restarting would be suboptimal, given that line-breaking happens completely independently of page-breaking. Line-breaks for the entire page-sequence, apart from the first few pages, will be invalidated and have to be recreated... and possibly again, upon the next page-break :/

The problem is that, if I interpret correctly, trimmed down to the essence, the main loop now looks like this:

generate first page
create list of line-breaks for the whole page-sequence
while (more line-breaks)
  compute best page-break
  if (more line-breaks)
    generate next page

Strictly speaking, this is total-fit line-breaking only for page-sequences consisting of one page. For the rest, it only offers guarantees insofar as the page-width remains constant (the first page's ipd).


Does anyone see a relatively simple way I have not yet seen? Or am I
more or less on track?

Depends on how we define simple, but it does address other areas as well.

What I had in mind as a first step was to detach page-generation from the page-breaking algorithm, such that the PageSequenceLM can set both available bpd and ipd of a LayoutContext before passing it to the FlowLM. Another way to look at it: page-breaking would actually become the outer loop, driving the line-breaking to take place in pieces, but as a first step, no more than that.

The PageProvider already caches the pages, so the BreakingAlgorithm would later have to iterate over them. (Currently, they are created on demand by the PageBreakingAlgorithm, so ipd changes aren't even accessible when computing the line-breaks? Unless by having the LineBreakingAlgorithm ask for the ipd of a given page?)

The most straightforward option would be to signal bp-overflow through a flag in the context. Once the line-breaks for a paragraph have been computed, the BlockLM updates the context: indicate bp-overflow at node X (no detailed idea yet on how this is supposed to look, but looking at the related code it doesn't seem too hard).

After getNextKnuthElements() for each BlockLevelLM has been called, the FlowLM can then check for the overflow flag, and if necessary, hand the element-list up to that point over to the PageSequenceLM. If I get the design correctly, it would then be up to the PageBreakingAlgorithm to decide whether the list will be consumed immediately --first-fit-- or whether following lists will be appended before computing any effective page-breaks --total-fit. (This could be made to depend on an extension property of the page-sequence?)

Roughly the loop would come to look like:

while (!flowLM.isFinished())
  generate next page
  update context dimensions
  while (no bp-overflow
          && no forced page-break)
    create next list of line-breaks
    if (first-fit)
      compute best page-break
      add areas
    else
      append to global list

For total-fit, the page-break computations can still be deferred and performed after all the best line-breaks in the page-sequence are known. The only difference is that the global list of line-breaks will already be optimized to take into account ipd changes due to varying page-masters.

The thing I'm still struggling with is the necessary change for this in the LayoutContext:
It seems that, to the line-breaking at least, this should either
a) actually contain a collection of contexts (?) or
b) be made aware of the bp-shifts implied by the line-breaks, so that getRefIPD() would always return the 'current IPD' [= at the implied bp-coordinate for a given node]
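Option b) could look roughly like the following Java sketch. It is purely hypothetical: IpdByOffset is not an existing FOP class and is not the real LayoutContext API, and it assumes that the bp-shift implied by the line-breaks is available as a single running offset. It maps that offset to the ipd of the page the offset falls on.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; stands in for a bp-aware getRefIPD(). */
final class IpdByOffset {
    // Each entry is {bpd, ipd} for one page, in document order.
    private final List<int[]> pages = new ArrayList<>();

    void addPage(int bpd, int ipd) {
        if (bpd <= 0) {
            throw new IllegalArgumentException("bpd must be positive");
        }
        pages.add(new int[] {bpd, ipd});
    }

    /** Returns the ipd of the page that contains the given cumulative bp-offset. */
    int ipdAt(long bpOffset) {
        if (pages.isEmpty()) {
            throw new IllegalStateException("no pages registered");
        }
        long start = 0;
        for (int[] page : pages) {
            if (bpOffset < start + page[0]) {
                return page[1];
            }
            start += page[0];
        }
        // Beyond the last known page: assume the last page geometry repeats.
        return pages.get(pages.size() - 1)[1];
    }
}
```

The sketch ignores that the bp-offset of a node itself depends on the line-breaks chosen so far, which is exactly the circularity that makes this option tricky.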


Another topic that we may have to address at some point is the
distinction of keeps on column level and keeps on page level. So far, we
can only map the keeps on column level. I wonder how we would go about
an implementation here. It seems to me that the page breaker would have
to start being more clever.

Anyway, the important thing for me right now is to have an idea how
hyphenation-keep would have to be implemented so I can take an estimate
and determine dependencies of tasks.

Well, I already saw possible advantages in what I was investigating for dealing with side- and end-floats. It would be possible, at the time of computing the line-breaks for a float, to determine whether it would by itself already cause an unavoidable bp-overflow (the same goes for before-floats and footnotes: maybe a possible solution to the open issue regarding footnotes and multi-column layout?)

Maybe it could help here too, since info about the 'current' region-body would be accessible to the LineBreakingAlgorithm?

Anyway, I'm guessing that the programming will become (a little) more complex to follow, but if page-breaking and line-breaking can be made to provide hints to each other, this would solve a lot of open issues.

Hope this gives you some clues.
I haven't made any changes myself yet, only did some information gathering in the source code.


Cheers,

Andreas
