Hi all,

My 2 cents, as I don't yet have a detailed understanding of all the issues raised here.
Globally we agree that the line- and page-breaking algorithms should be modified to exchange some information. Now I have the feeling that backtracking is actually not necessary. I've put some words about that in my GSoC wiki page about side-floats. There are also Simon's comments in a preceding thread [1].

Line-breaking could be somehow driven by page-breaking, both being done at the same time. At each iteration of line-breaking, the currently considered page context may be passed to the line-breaking algorithm; roughly:

    create a node for page 0, line 0
    for each legal linebreak do
        for each active node do
            considerLegalBreak(linebreak, page context of the active node)
            if this is a feasible line break
                record an active node, line level
            if this is a feasible page break
                record an active node, page level

In considerLegalBreak, we would have all the necessary information from the current page-level active node:
- would this line be the last line of the page/column? Then if the current legal linebreak is a hyphen it doesn't make a feasible linebreak
- this would be the last line of the last column and there is a keep-together.within-page? Then no feasible page-break

Depending on the current active node, a legal linebreak could end the last line of page n, in which case the ipd of page n is to be considered; or it could end the first line of page n+1, and then we must take the ipd of page n+1.

We could modulate the degree of total-fit we want: for a real true total-fit we keep the active nodes for the whole document. Or each time the end of a page-sequence is reached, we stop the algorithm, choose the current best layout (and can start creating the areas), and restart from scratch at the next page-sequence. Or we do that each time a forced page-break is met.

We could choose to reset the line-level active nodes at the end of each paragraph, and choose the number of lines leading to the optimal layout for that paragraph (this is the current situation).
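To make the loop above a bit more concrete, here is a minimal Python sketch. It is only an illustration under strong simplifications, not FOP code: the names Box, Page, ActiveNode, consider_legal_break and break_paragraph are made up, there is no glue stretch/shrink (demerits are just the squared unused width), a page break is only taken when the page is full, and the geometry of the last declared page is assumed to repeat. The point is just that each active node carries its page context, so the line breaker sees the right ipd and can reject a hyphenated last line without backtracking.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    width: int                  # width including any trailing space (simplification)
    hyphen_after: bool = False  # True if breaking after this box hyphenates a word


@dataclass(frozen=True)
class Page:
    ipd: int        # available line width on this page
    max_lines: int  # number of lines that fit on this page


@dataclass
class ActiveNode:
    position: int   # index of the first box of the next line
    page: int       # page that will receive the next line
    line: int       # index of that line within the page
    demerits: int
    break_at: int = -1                    # legal break that created this node
    prev: Optional["ActiveNode"] = None


def consider_legal_break(boxes: List[Box], brk: int, node: ActiveNode,
                         pages: List[Page],
                         hyphenation_keep: bool = True) -> Optional[ActiveNode]:
    """Return the node reached by ending a line after box `brk`, or None."""
    page = pages[min(node.page, len(pages) - 1)]  # assumption: last page repeats
    slack = page.ipd - sum(b.width for b in boxes[node.position:brk + 1])
    if slack < 0:
        return None  # line too long for the ipd of the node's page
    last_of_paragraph = brk == len(boxes) - 1
    demerits = node.demerits + (0 if last_of_paragraph else slack ** 2)
    if node.line + 1 < page.max_lines:
        # feasible line break: the next line stays on the same page
        return ActiveNode(brk + 1, node.page, node.line + 1, demerits, brk, node)
    # this line would be the last one of the page, so it is also a page break
    if hyphenation_keep and boxes[brk].hyphen_after:
        return None  # a hyphenated line must not end the page
    return ActiveNode(brk + 1, node.page + 1, 0, demerits, brk, node)


def break_paragraph(boxes: List[Box], pages: List[Page],
                    hyphenation_keep: bool = True) -> Tuple[List[int], int]:
    """Return (indices of the boxes that end each line, total demerits)."""
    # keep only the best node per (position, page, line)
    active = {(0, 0, 0): ActiveNode(0, 0, 0, 0)}
    for brk in range(len(boxes)):             # every box end is a legal break here
        for node in list(active.values()):    # snapshot: new nodes start after brk
            if node.position > brk:
                continue
            new = consider_legal_break(boxes, brk, node, pages, hyphenation_keep)
            if new is None:
                continue
            key = (new.position, new.page, new.line)
            if key not in active or new.demerits < active[key].demerits:
                active[key] = new
    finished = [n for n in active.values() if n.position == len(boxes)]
    if not finished:
        raise ValueError("no feasible layout")
    best = min(finished, key=lambda n: n.demerits)
    breaks, node = [], best
    while node.prev is not None:              # walk back to the start node
        breaks.append(node.break_at)
        node = node.prev
    return breaks[::-1], best.demerits


boxes = [Box(5), Box(5), Box(4), Box(5, hyphen_after=True), Box(4), Box(4)]
pages = [Page(ipd=10, max_lines=2), Page(ipd=9, max_lines=2)]
print(break_paragraph(boxes, pages))
print(break_paragraph(boxes, pages, hyphenation_keep=False))
```

With these made-up boxes, switching the keep off lets the hyphenated line end page 0 (lines end after boxes 1, 3, 5), while with the keep on the sketch settles on a layout where page 0 ends on an unhyphenated line (lines end after boxes 0, 2, 4, 5), at the price of higher demerits. Note that the lines falling on the second page are measured against its ipd of 9, not 10.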
Or, instead of resetting the line-level active nodes at each paragraph end, we could just keep the best active node for each possible number of lines and discard the others; so there would usually be three active nodes for a paragraph instead of the current one.

We could, each time a feasible page break is found, record it only if its demerits are lower than those of the currently recorded page break for the same number of pages (page-level best-fit). We could also do that for paragraphs (line-level best-fit, for very simple documents).

So, in my opinion, and with the still limited knowledge I have of some layout problems (balanced columns, several spans for a page...), this should be just a matter of passing the right information to the line-breaking algorithm and recording it in the active nodes.

Hope that can give you further ideas,
Vincent

[1] http://mail-archives.apache.org/mod_mbox/xmlgraphics-fop-dev/200608.mbox/[EMAIL PROTECTED]

2006/8/31, Jeremias Maerki:
I'm investigating what would be necessary to implement hyphenation-keep. After some thought, I think this is one of those very mean properties that fire back from page-breaking into line-breaking. IOW, when you detect a page/column break at a line which is hyphenated you'll basically have to track back and redo the line breaking, disabling that particular hyphenation possibility. You then have to redo the page breaking, possibly having to backtrack again if another hyphenated line ends up at the end of a column/page. Doesn't sound like a small change.

The cheap way, of course, is to add penalty values to discourage page breaks between hyphenated lines (when hyphenation-keep is activated) but that could lead to ugly layout. It's certainly better to disable certain hyphenation points based on feedback from page breaking but it obviously means starting to backtrack into line breaking.

Maybe the "changing available IPD" problem also plays into this. As we've seen, it may be necessary to redo certain line breaks based on events in page breaking. Does anyone see a relatively simple way I have not yet seen? Or am I more or less on track?

Another topic that we may have to address at some point is the distinction between keeps on column level and keeps on page level. So far, we can only map the keeps on column level. I wonder how we would go about an implementation here. It seems to me that the page breaker would have to start being more clever.

Anyway, the important thing for me right now is to have an idea of how hyphenation-keep would have to be implemented so I can make an estimate and determine dependencies of tasks.

Thanks for any ideas,
Jeremias Maerki