Re: Implementation of hyphenation-keep property

Jeremias Maerki Thu, 31 Aug 2006 11:59:52 -0700

Wow, I have to digest this first. I have a busy month behind me with not
much of my brain allocated to FOP. But thanks so far for the feedback.
What I can deduce from this is that my suspicion is probably correct:
implementing hyphenation-keep will be quite tricky with the current
code. I assume we have to make a few changes so that page- and
line-breaking interact more closely (for "changing available IPD" etc.).


On 31.08.2006 17:57:14 Andreas L Delmelle wrote:
> On Aug 31, 2006, at 17:04, Jeremias Maerki wrote:
> 
> Hi Jeremias,
> 
> > I'm investigating what would be necessary to implement
> > hyphenation-keep. After some thought, I think this is one of those
> > very mean properties that fire back from page-breaking back into
> > line-breaking. IOW, when you detect a page/column break at a line
> > which is hyphenated you'll basically have to track back and redo the
> > line breaking, disabling that particular hyphenation possibility. You
> > then have to redo the page breaking possibly having to backtrack
> > again if another hyphenated line is again at the end of a
> > column/page. Doesn't sound like a small change.
> 
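The backtracking described above can be sketched as a toy model. This is not FOP code: the greedy line breaker, the fixed number of lines per page, and the names `break_lines`, `paginate` and `layout_with_hyphenation_keep` are all assumptions made for illustration. Words are given as tuples of syllables, and a hyphenation point is identified by (word index, syllable index).

```python
def break_lines(words, width, disabled=frozenset()):
    """Greedy line breaking over words given as tuples of syllables.

    `disabled` holds (word_index, split_index) hyphenation points that
    must not be used. Returns (text, hyphen_point_or_None) per line.
    """
    lines, current = [], ""
    w, s = 0, 0  # current word, first syllable not yet placed
    while w < len(words):
        syllables = words[w]
        sep = " " if current else ""
        rest = "".join(syllables[s:])
        if len(current + sep + rest) <= width:
            current += sep + rest  # the remaining word fits as a whole
            w, s = w + 1, 0
            continue
        # Try the longest allowed hyphenated prefix that still fits.
        for k in range(len(syllables) - 1, s, -1):
            head = "".join(syllables[s:k]) + "-"
            if (w, k) not in disabled and len(current + sep + head) <= width:
                lines.append((current + sep + head, (w, k)))
                current, s = "", k
                break
        else:
            if not current:
                raise ValueError("word does not fit on an empty line")
            lines.append((current, None))  # close the line, retry the word
            current = ""
    if current:
        lines.append((current, None))
    return lines


def paginate(lines, lines_per_page):
    """Stand-in for page breaking: a fixed number of lines per page."""
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]


def layout_with_hyphenation_keep(words, width, lines_per_page):
    """Redo line and page breaking until no page ends on a hyphenated line."""
    disabled = set()
    while True:
        lines = break_lines(words, width, disabled)
        pages = paginate(lines, lines_per_page)
        offending = [page[-1][1] for page in pages if page[-1][1] is not None]
        if not offending:
            return pages
        # Forbid the first offending hyphenation point and start over.
        disabled.add(offending[0])


sample = [("page",), ("break", "ing"), ("is",), ("hy", "phen", "ated"), ("here",)]
pages = layout_with_hyphenation_keep(sample, width=10, lines_per_page=3)
```

With this sample the loop has to go round twice, because the replacement break "hy-" also lands at the end of the page. That is the repeated backtracking the paragraph above worries about, and each pass redoes the line breaking for the whole text.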
> As it happens, I've been looking in the same direction, although not  
> particularly the hyphenation-keep property.
> 
> > The cheap way, of course, is to add penalty values to discourage page
> > breaks between hyphenated lines (when hyphenation-keep is activated)
> > but that could lead to ugly layout. It's certainly better to disable
> > certain hyphenation points based on feedback from page breaking but
> > it obviously means starting to backtrack into line breaking. Maybe
> > the "changing available IPD" problem also plays into this. As we've
> > seen, it may be necessary to redo certain line breaks based on events
> > in page breaking.
> 
> I've been doing some more browsing in the code and re-read your Wiki  
> page, and I'm getting more convinced that line-breaking should not be  
> made literally 'restartable' to deal with varying ipd between pages.  
> This does NOT mean that we don't need restartable line-breaking at  
> all, only that I think it's not the solution to that particular problem.
> 
> In fact, in some cases --if the ipd-change occurs early in the
> page-sequence-- restarting would be suboptimal, given that
> line-breaking happens completely independently of page-breaking.
> Line-breaks for the entire page-sequence, apart from the first few
> pages, will be invalidated and have to be recreated... and possibly
> again, upon the next page-break :/
> 
> The problem is that, if I interpret correctly, trimmed down to the  
> essence, the main loop now looks like this:
> 
> generate first page
> create list of line-breaks for the whole page-sequence
> while (more line-breaks)
>    compute best page-break
>    if (more line-breaks)
>      generate next page
> 
> Strictly speaking, this is total-fit line-breaking only for
> page-sequences consisting of one page. As to the rest, it only offers
> guarantees inasmuch as the page-width remains constant (the first
> page's ipd).
> 
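The limitation of the loop above can be shown with a small toy model. It is a sketch under assumptions, not FOP code: `wrap_greedy`, `layout_current` and `too_wide` are hypothetical names, line breaking is greedy rather than total-fit, widths are counted in characters, and "compute best page-break" is replaced by a fixed number of lines per page.

```python
def wrap_greedy(words, ipd):
    """Greedy line breaking of a word list to a width given in characters."""
    lines, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if len(candidate) <= ipd or not current:
            current = candidate  # an overlong word still gets its own line
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


def layout_current(words, page_ipds, lines_per_page):
    """All line-breaks are computed once, using only the first page's ipd."""
    lines = wrap_greedy(words, page_ipds[0])
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]


def too_wide(pages, page_ipds):
    """List (page_index, line) for lines wider than the page they landed on."""
    result = []
    for index, page in enumerate(pages):
        ipd = page_ipds[min(index, len(page_ipds) - 1)]
        result.extend((index, line) for line in page if len(line) > ipd)
    return result


words = "aaaa bbbb cccc dddd eeee ffff".split()
pages = layout_current(words, page_ipds=[9, 4], lines_per_page=2)
problems = too_wide(pages, page_ipds=[9, 4])
```

In this model the second page is narrower than the first, so the line that ends up there was broken for the wrong width and shows up in `problems`.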
> >
> > Does anyone see a relatively simple way I have not yet seen? Or am I
> > more or less on track?
> 
> Depends on how we define simple, but it does address other areas as  
> well.
> 
> What I had in mind as a first step, was to detach page-generation  
> from the page-breaking algorithm, such that the PageSequenceLM can  
> set both available bpd and ipd of a LayoutContext before passing it  
> to the FlowLM.
> Another way to look at it: page-breaking would actually become the  
> outer loop, driving the line-breaking to take place in pieces, but as  
> a first step, no more than that.
> 
> The PageProvider already caches the pages, so the BreakingAlgorithm  
> would later have to iterate over them (whereas currently, they are  
> created on demand of the PageBreakingAlgorithm, so ipd changes aren't  
> even accessible when computing the line-breaks? Unless by having the
> LineBreakingAlgorithm ask for the ipd of a given page?)
> 
> The most straightforward option would be to signal bp-overflow
> through a flag in the context. Once the line-breaks for a paragraph
> have been computed, the BlockLM updates the context: indicate
> bp-overflow at node X (no detailed idea yet on how this is supposed to
> look, but looking at the related code it doesn't seem too hard).
> 
> After getNextKnuthElements() for each BlockLevelLM has been called,  
> the FlowLM can then check for the overflow flag, and if necessary,  
> hand the element-list up to that point over to the PageSequenceLM. If  
> I get the design correctly, it would then be up to the  
> PageBreakingAlgorithm to decide whether the list will be consumed  
> immediately --first-fit-- or whether following lists will be appended  
> before computing any effective page-breaks --total-fit. (This could  
> be made to depend on an extension property of the page-sequence?)
> 
> Roughly the loop would come to look like:
> 
> while (!flowLM.isFinished())
>    generate next page
>    update context dimensions
>    while (no bp-overflow
>            && no forced page-break)
>      create next list of line-breaks
>      if (first-fit)
>        compute best page-break
>        add areas
>      else
>        append to global list
> 
> For total-fit, the page-break computations can still be deferred and
> performed after all the best line-breaks in the page-sequence are
> known. The only difference is that the global list of line-breaks
> will already be optimized to take into account ipd changes due to
> varying page-masters.
> 
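A minimal sketch of the first-fit branch of the proposed loop, with page-breaking as the outer loop. It is a toy model and not FOP code: `take_line` and `layout_page_driven` are hypothetical names, bp-overflow is modelled as a fixed number of lines per page, widths are counted in characters, and forced page-breaks and the total-fit branch are left out.

```python
def take_line(words, ipd):
    """Greedily fill one line of width `ipd`; return (line, remaining words)."""
    line, used = words[0], 1  # always take one word so the loop advances
    while used < len(words) and len(line) + 1 + len(words[used]) <= ipd:
        line += " " + words[used]
        used += 1
    return line, words[used:]


def layout_page_driven(words, page_ipds, lines_per_page):
    """Page-breaking drives line-breaking, one page's worth at a time."""
    pages, remaining, page_index = [], list(words), 0
    while remaining:  # while (!flowLM.isFinished())
        # "generate next page" and "update context dimensions"
        ipd = page_ipds[min(page_index, len(page_ipds) - 1)]
        page = []
        # Stand-in for "no bp-overflow": the page holds a fixed line count.
        while remaining and len(page) < lines_per_page:
            line, remaining = take_line(remaining, ipd)
            page.append(line)
        pages.append(page)  # first-fit: the page is consumed immediately
        page_index += 1
    return pages


words = "aaaa bbbb cccc dddd eeee ffff".split()
pages = layout_page_driven(words, page_ipds=[9, 4], lines_per_page=2)
```

Because each page's lines are broken only after that page's ipd is known, the lines on a narrower second page are broken to the narrower width rather than reusing the first page's measure.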
> The thing I'm still struggling with is the necessary change for this  
> in the LayoutContext:
> It seems that, to the line-breaking at least, this should either
> a) actually contain a collection of contexts (?) or
> b) be made aware of the bp-shifts implied by the line-breaks, so that  
> getRefIPD() would always return the 'current IPD' [= at the implied  
> bp-coordinate for a given node]
> 
> >
> > Another topic that we may have to address at some point is the
> > distinction of keeps on column level and keeps on page level. So far,
> > we can only map the keeps on column level. I wonder how we would go
> > about an implementation here. It seems to me that the page breaker
> > would have to start being more clever.
> 
> > Anyway, the important thing for me right now is to have an idea how
> > hyphenation-keep would have to be implemented so I can take an
> > estimate and determine dependencies of tasks.
> 
> Well, I already saw possible advantages in what I was investigating  
> for dealing with side- and end-floats. It would be possible, at the  
> time of computing the line-breaks for a float, to determine whether  
> it would by itself already cause an unavoidable bp-overflow (ditto
> for before-floats and footnotes: maybe a possible solution to
> the open issue regarding footnotes and multi-column layout?)
> 
> Maybe it could help here too, since info about the 'current'
> region-body would be accessible to the LineBreakingAlgorithm?
> 
> Anyway, I'm guessing that the programming will become (a little)
> more complex to follow, but if page-breaking and line-breaking can be  
> made to provide hints to each other, this would solve a lot of open  
> issues.
> 
> Hope this gives you some clues.
> I haven't made any changes myself yet, only did some information  
> gathering in the source code.
> 
> 
> Cheers,
> 
> Andreas



Jeremias Maerki
