Thanks a lot Luca! This will help me find my way in the code. I keep
your comments in mind for when I better understand the whole issue.

Vincent


Luca Furini a écrit :
Jeremias Maerki wrote:

did you already investigate how footnotes are implemented? Can you say
anything about how similar the problem of footnotes is to before-floats?
Just so you don't have to start from scratch while there may be
something to build upon. After all, the footnotes also contain some
logic to move certain parts to a different page than where anchor is
located.


A few quick comments about the footnote implementation:

1) the FootnoteLM returns only the sequence of elements representing the inline part (not the footnote-body part); it just adds to the last (inline) box a reference to the FootnoteBodyLM.

2) the LineLM, after computing the breaks, adds to each (block) box representing a line the references to the FootnoteBodyLM whose citations are in that line

3) during the remaining of the element collection phase, these references are not used (but in the creation of "combined" element lists, when they should be copied inside the new elements)

4) the PageSequenceLM.PageBreaker.getNextKnuthElements() method, after receiving all the (block) elements, scans them looking for footnote information, gets the elements from the referenced FootnoteBodyLM and puts them in a different list (at the moment a list of lists, but this is sub-optimal), and from the footnote-separator (in a separate list)

5) these lists are looked at in PageBreakingAlgorithm.computeDifference(), where we try to add some footnote content to the "normal" page content using getFootnoteSplit(), and in computeDemerits(), where some extra demerits are added if we break a footnote or some footnotes are deferred.

This last point at the moment is performed using many PageBreakingAlgorithm private variables, which is maybe not the best way to do it, as we must be very careful about their initialization and their use, especially when the algorithm restarts. I think that a "state" object storing these variables could be used to store these values, and explicitly passed along the methods instead of relying on the class members, but concerning this I'd like to hear the opinions of the other committers ...

Insertion of before-floats could be implemented in a similar way, giving the precedence to the footnote insertion (as it is affected by more strict constraints).

An important difference between a footnote and a before-float is that the latter does not have an "inline part", so (if we want to follow the same pattern) we need to either store the reference inside a previously-created box or to add some new elements containing the reference (but we must be sure that these elements cannot be parted from the previous ones, see the constraints in section 6.10.2 in the spec).

A crucial point is the demerit function as, if I remember correctly, it greatly affect the computational complexity of the breaking algorithm (thre should be a M. Plass paper concerning this).

HTH

Another thing that we may need to keep in mind: There was lots of desire
from the user community that FOP supports large documents (long-term
goal, not necessary yours). I wrote that a first-fit algorithm could
help free memory earlier. Obviously, for complex before-float situations
a total-fit approach is probably more interesting as it can come up with
more "creative" solutions. I'm just mentioning it so we keep the bigger
picture in mind and since there could be conflicting goals.


A "first degree" of first-fit algorithm could be achieved quite quickly by having a BreakingAlgorithm interface which is implemented by a TotalFitBA (the existing implementation) and a FirstFitBA which would have a much simpler considerLegalBreak() method that, instead of the complex set of nodes, just keeps in mind a single node.

This would surely decrease the memory footprint, but is not (I think) what we really want, as this simplified algorithm would be performed on the whole sequence of elements.

In order to start processing the sequence as soon as we receive a few elements we need to do some deeper changes.

An idea (I just had it now, so I did not fully consider all its implications). At the moment, the block-level LM collect elements from their children and return just a single sequence (if there are no break conditions); we could have a parameter requesting them to return after they receive each child sub-sequence, and have a canStartComputingBreak() method that returns true if the sequence contains enough elements and we are using a first-fit algorithm, or false otherwise ...

Sorry for the long post ... and for the long absence too, but it seems that just after thinking "great, now I've really got some time to spend on FOP" I receive tons of other things to do ... :-(

Regards
    Luca

Reply via email to