Date: 2005-02-25T13:50:17
Editor: SimonPepping
Wiki: XML Graphics - FOP Wiki
Page: PageLayout
URL: http://wiki.apache.org/xmlgraphics-fop/PageLayout
no comment

Change Log:

------------------------------------------------------------------------------
@@ -1,5 +1,7 @@
 = Page layout management =

+== Returning, collecting and managing the possible pagebreaks ==
+
 This is a plan to achieve page layout management in FOP. With page
 layout management I mean that the application is able to apply an
 algorithm that makes the best choice from a number of layout
@@ -84,6 +86,26 @@
 the caller if the BP is part of the current page. It can do so due to
 its ordered list of BPs.

+Luca Furini
+[http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]&msgNo=10549]
+pointed to the situation where an FONode has keep-together, which
+influences the BPs of its whole subtree. This presents some difficulty
+for the above strategy. The property must be propagated down the
+subtree of LMs, which must signal it in their BPs so that the PageLM
+can take it into account.
+
+Finn Bock
+[http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]&msgNo=10550]
+favours returning the BPs up the stack of LMs up to the LM that does
+the page breaking, analogous to the LineLM in paragraph layout. It
+makes it easier to handle the above case. But it is rather expensive
+in processing, also because each LM creates a new BP which wraps the
+received BP and which is returned to its parentLM.
+
+Page breaking was discussed in this email thread:
+[http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]&by=thread&from=984205].
+The above two messages are part of it.
+
 == Current situation ==

 LineLM.getNextBreakPoss returns a BP for each line.
@@ -99,5 +121,157 @@
 child BP to return the areas up to that child BP. A BP has a direct
 correspondence to a child BP through its member leafPos, which is an
 index into the list of child BPs.
+
+== Page breaking strategies ==
+
+The current BP system uses a first-fit strategy. As soon as the page
+height is exceeded, the page break is laid at the preceding BP. This
+strategy is simple, but does not provide choice.
+Therefore it is hard, although not impossible, to take keeps into account.
+
+The next simple strategy is a best-fit strategy. One collects at least
+one BP that no longer fits on the page. There may be several possible
+page breaks due to stretchability/shrinkability on the page. Then one
+weighs the possible page breaking points with a suitable system of
+merits and demerits, and selects the best page break. This method
+requires a better overview of the available break points and
+stretchability/shrinkability on the page. This strategy is used by TeX
+in its vertical list.
+
+The next possible strategy uses look ahead. There is a sliding window
+of ''N'' pages. The best page breaks are calculated over all pages in the
+window, but only the page break of the first page is used. Then the
+window is moved forward one page, and the effort is repeated. This
+strategy may result in a better placement of floating elements.
+
+In a variant of the best-fit strategy and of the look-ahead strategy,
+one may use the page breaks of ''M'' pages. This is useful for balancing
+facing pages (''M''=2, even page/odd page).
+
+A total-fit strategy, as used in paragraph breaking, can only be
+applied to a complete page sequence. This will almost always be too
+expensive, both in computing effort and in memory requirements.
+
+== Layout around a page break ==
+
+The layout of a part of the page depends on whether the part is laid
+out at a page break or not. For example, the resolution of space
+specifiers is different when they occur at the edge of a reference
+area or not. When a page break occurs in a table, a footer is
+added. The borders of the footer interact with those of the last rows
+and cells.
+
+A BP should always represent the situation which would occur if it
+were the selected page break. The following BP partly undoes the
+calculation for its predecessor, because it represents itself as the
+selected page break.
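As a rough illustration of the rule that a BP represents the situation in which it is the selected page break, the following Python sketch models break possibilities between table rows. The class `BreakPoss`, its fields and the function `first_fit` are hypothetical names chosen for this example and are not FOP code. Every BP adds the footer height to its own candidate page height, while the next BP starts again from the bare content height, which is the "partly undone" calculation. The selection rule is the first-fit rule described under "Page breaking strategies".

```python
from dataclasses import dataclass


@dataclass
class BreakPoss:
    """Hypothetical break possibility after a table row."""
    content_height: int  # cumulative height of the rows up to this BP
    footer_height: int   # height that is added only if the page breaks here

    def height_if_selected(self) -> int:
        # The BP reports the page height as if it were the chosen break,
        # so the footer is always included.
        return self.content_height + self.footer_height


def make_bps(row_heights, footer_height):
    """Build one BP after each row; the footer is not carried over."""
    bps, total = [], 0
    for h in row_heights:
        total += h  # the predecessor's footer is dropped again here
        bps.append(BreakPoss(total, footer_height))
    return bps


def first_fit(bps, page_height):
    """Return the index of the last BP that still fits, or None."""
    chosen = None
    for i, bp in enumerate(bps):
        if bp.height_if_selected() > page_height:
            break
        chosen = i
    return chosen


with_footer = make_bps([30, 30, 30, 30], footer_height=20)
without_footer = make_bps([30, 30, 30, 30], footer_height=0)
break_with_footer = first_fit(with_footer, page_height=100)
break_without_footer = first_fit(without_footer, page_height=100)
```

In this example three rows fit on a page of height 100 when no footer is needed, but only two rows fit once each BP accounts for the footer it would bring along.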
+
+When a page break has been selected, the calculations for the BP after
+the selected page break must be revised. They must now take elements
+at the start of a new page into account. The resolution of the space
+specifiers before the first block is different because it now occurs
+at the before edge of a reference area. When a page break occurs in a
+table, a header is added. The borders of the header interact with
+those of the first rows and cells.
+
+The paragraph layout mechanism of Knuth has elements, penalties, which
+are taken into account when they occur at a line break, but are
+ignored when they do not occur at a line break. See especially the use
+of a penalty to represent a possible hyphenation point. Other
+elements, glue items, are ignored when they occur at a line break, but
+are taken into account when they do not occur at a line break.
+
+It may be useful to do the same for page breaks (see Finn Bock's
+ideas). But the situation both at the end and at the start of a page
+is more complicated. At the end of a page elements are not only
+removed, they may also be added. The same holds at the start of a page.
+
+== Expressing layout around a pagebreak as Knuth elements ==
+
+We consider the following layout situations.
+
+=== Space specifiers ===
+
+When the space specifiers resolve to zero around a page break, we are
+in the same situation as that of a word space in line breaking. It is
+represented by the sequence `box - glue - box`.
+
+When the space specifiers do not resolve to zero around a page break,
+we are in the same situation as that of a word space in line breaking
+in the case of centered lines. It is represented by the sequence
+{{{
+box - infinite penalty - glue(ha) - zero penalty - glue(hn-ha-hb) - zero width box - infinite penalty - glue(hb) - box
+}}}
+where ha is the bpd of the space-after before the page break, hb is
+the bpd of the space-before after the page break, and hn is the bpd of
+the space when there is no page break.
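The arithmetic behind this sequence can be checked with a small Python sketch. It is only an illustration under stated assumptions: the `Element` class and the helpers `space_sequence`, `height` and `split_at` are hypothetical names, and the discard rule (glue and penalties are dropped after a break until the next box) is the usual Knuth line-breaking convention carried over to pages. The symbols ha, hb and hn follow the sequence above; the surrounding content boxes are left out.

```python
from dataclasses import dataclass

INF = float("inf")


@dataclass
class Element:
    kind: str           # "box", "glue" or "penalty"
    height: int = 0     # bpd contributed when the element is kept
    penalty: float = 0  # only meaningful for kind == "penalty"


def space_sequence(ha, hb, hn):
    """Elements between the two content boxes (boxes themselves omitted)."""
    return [
        Element("penalty", penalty=INF),  # no break before glue(ha)
        Element("glue", ha),
        Element("penalty", penalty=0),    # the only legal break
        Element("glue", hn - ha - hb),
        Element("box", 0),                # zero width box stops discarding
        Element("penalty", penalty=INF),  # no break before glue(hb)
        Element("glue", hb),
    ]


def height(elements):
    """Total bpd of the kept elements."""
    return sum(e.height for e in elements)


def split_at(elements, index):
    """Break at the penalty at `index`; discard glue/penalties up to the next box."""
    before = elements[:index]
    after = elements[index + 1:]
    while after and after[0].kind != "box":
        after.pop(0)
    return before, after


seq = space_sequence(ha=4, hb=6, hn=12)
no_break_height = height(seq)
end_of_page, start_of_page = split_at(seq, 2)
```

Without a break the three glue items add up to hn. With a break at the zero penalty, glue(ha) stays at the end of the page, glue(hn-ha-hb) is discarded, and glue(hb) survives at the start of the next page because it follows the zero width box.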
+
+=== Possible page break between content elements ===
+
+Here the most general situation is that in which the content is
+different with and without a page break:
+ * content Cn when there is no page break,
+ * content Ca at the end of the page before the page break,
+ * content Cb at the start of the page after the page break.
+
+An example of this situation is a page break between table rows:
+
+{{{
+no page break:      page break:
+
+---------           ---------
+  row 1               row 1
+---------           ---------
+ border n            border a
+---------           ---------
+  row 2               footer
+---------           ---------
+                    page break
+                    ---------
+                      header
+                    ---------
+                     border b
+                    ---------
+                      row 2
+                    ---------
+}}}
+
+This situation cannot be dealt with using Knuth's box/glue/penalty
+model. We introduce two new types of elements, both of which are a
+kind of penalty:
+
+ 1. A penalty with height hn, which is the height of Cn, and with height ha, which is the height of Ca.
+ 1. An infinite penalty with height hn', which is the height of Cn', and with height hb, which is the height of Cb, and with a flag value of start-page. Here Cn' is the part of Cn which is not represented in the penalty of type 1; normally it is empty.
+
+Both penalty types differ from a normal penalty in that they also
+insert a height when there is no page break. For type 2 that is a
+rather theoretical possibility. Penalty type 2 differs from a normal
+penalty in that it is not discarded when it occurs at the start of a
+new page. Note that in that case it is not the page break itself; the
+page break is at a preceding glue or penalty from which it is
+separated by glue or normal penalty items, which are discarded at the
+start of a page. Both types share some features of a box and of a
+penalty; they are a kind of conditional box.
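The behaviour of these two conditional penalties can be sketched in Python for the table-row example. This is a minimal model under assumptions, not a proposal for the FOP classes: `Box`, `CondPenalty` and `page_heights` are hypothetical names, and the numbers for hn, ha and hb are made up. A type 1 penalty contributes hn without a break and ha when the page breaks at it; a type 2 penalty (flag start-page) contributes hb when it is the first kept element of a new page and hn' otherwise.

```python
from dataclasses import dataclass


@dataclass
class Box:
    height: int


@dataclass
class CondPenalty:
    no_break_height: int      # hn (type 1) or hn' (type 2)
    break_height: int         # ha (type 1) or hb (type 2)
    start_page: bool = False  # True marks a type 2 penalty


def page_heights(elements, break_at=None):
    """Return the height of each page; break_at indexes a type 1 penalty."""
    pages = [0]
    at_page_start = False
    for i, el in enumerate(elements):
        if isinstance(el, Box):
            pages[-1] += el.height
            at_page_start = False
        elif el.start_page:
            # Type 2 is not discarded at the start of a page.
            pages[-1] += el.break_height if at_page_start else el.no_break_height
        elif i == break_at:
            # Type 1 chosen as the page break: Ca ends the page.
            pages[-1] += el.break_height
            pages.append(0)
            at_page_start = True
        else:
            pages[-1] += el.no_break_height
    return pages


# Assumed heights: rows 20, border n 2, border a + footer 12, header + border b 12.
rows = [
    Box(20),
    CondPenalty(no_break_height=2, break_height=12),
    CondPenalty(no_break_height=0, break_height=12, start_page=True),
    Box(20),
]
single_page = page_heights(rows)
two_pages = page_heights(rows, break_at=1)
```

Without a break only border n sits between the rows. With a break at the type 1 penalty the first page ends with border a and the footer, and the second page starts with the header and border b.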
+
+The above table rows are then represented as:
+{{{
+box(row 1) - penalty(hn, ha, 0, 1) - penalty(0, hb, inf, start-page) - box(row 2)
+}}}
+Without a page break this becomes:
+{{{
+box(row 1) - hn (border n) - box(row 2)
+}}}
+With a page break this becomes:
+{{{
+box(row 1) - ha (border a + footer)
+hb (header + border b) - box(row 2)
+}}}
+Here ha and hb include the border-after of the footer and the
+border-before of the header.
+
+Finn Bock proposed the idea of using Knuth semantics to express page
+breaking.
+
 This plan was written by SimonPepping