Thanks a lot Luca! This will help me find my way in the code. I keep
your comments in mind for when I better understand the whole issue.
Vincent
Luca Furini a écrit :
Jeremias Maerki wrote:
did you already investigate how footnotes are implemented? Can you say
anything about how similar the problem of footnotes is to before-floats?
Just so you don't have to start from scratch while there may be
something to build upon. After all, the footnotes also contain some
logic to move certain parts to a different page than where anchor is
located.
A few quick comments about the footnote implementation:
1) the FootnoteLM returns only the sequence of elements representing the
inline part (not the footnote-body part); it just adds to the last
(inline) box a reference to the FootnoteBodyLM.
2) the LineLM, after computing the breaks, adds to each (block) box
representing a line the references to the FootnoteBodyLM whose citations
are in that line
3) during the remaining of the element collection phase, these
references are not used (but in the creation of "combined" element
lists, when they should be copied inside the new elements)
4) the PageSequenceLM.PageBreaker.getNextKnuthElements() method, after
receiving all the (block) elements, scans them looking for footnote
information, gets the elements from the referenced FootnoteBodyLM and
puts them in a different list (at the moment a list of lists, but this
is sub-optimal), and from the footnote-separator (in a separate list)
5) these lists are looked at in
PageBreakingAlgorithm.computeDifference(), where we try to add some
footnote content to the "normal" page content using getFootnoteSplit(),
and in computeDemerits(), where some extra demerits are added if we
break a footnote or some footnotes are deferred.
This last point at the moment is performed using many
PageBreakingAlgorithm private variables, which is maybe not the best way
to do it, as we must be very careful about their initialization and
their use, especially when the algorithm restarts. I think that a
"state" object storing these variables could be used to store these
values, and explicitly passed along the methods instead of relying on
the class members, but concerning this I'd like to hear the opinions of
the other committers ...
Insertion of before-floats could be implemented in a similar way, giving
the precedence to the footnote insertion (as it is affected by more
strict constraints).
An important difference between a footnote and a before-float is that
the latter does not have an "inline part", so (if we want to follow the
same pattern) we need to either store the reference inside a
previously-created box or to add some new elements containing the
reference (but we must be sure that these elements cannot be parted from
the previous ones, see the constraints in section 6.10.2 in the spec).
A crucial point is the demerit function as, if I remember correctly, it
greatly affect the computational complexity of the breaking algorithm
(thre should be a M. Plass paper concerning this).
HTH
Another thing that we may need to keep in mind: There was lots of desire
from the user community that FOP supports large documents (long-term
goal, not necessary yours). I wrote that a first-fit algorithm could
help free memory earlier. Obviously, for complex before-float situations
a total-fit approach is probably more interesting as it can come up with
more "creative" solutions. I'm just mentioning it so we keep the bigger
picture in mind and since there could be conflicting goals.
A "first degree" of first-fit algorithm could be achieved quite quickly
by having a BreakingAlgorithm interface which is implemented by a
TotalFitBA (the existing implementation) and a FirstFitBA which would
have a much simpler considerLegalBreak() method that, instead of the
complex set of nodes, just keeps in mind a single node.
This would surely decrease the memory footprint, but is not (I think)
what we really want, as this simplified algorithm would be performed on
the whole sequence of elements.
In order to start processing the sequence as soon as we receive a few
elements we need to do some deeper changes.
An idea (I just had it now, so I did not fully consider all its
implications).
At the moment, the block-level LM collect elements from their children
and return just a single sequence (if there are no break conditions); we
could have a parameter requesting them to return after they receive each
child sub-sequence, and have a canStartComputingBreak() method that
returns true if the sequence contains enough elements and we are using a
first-fit algorithm, or false otherwise ...
Sorry for the long post ... and for the long absence too, but it seems
that just after thinking "great, now I've really got some time to spend
on FOP" I receive tons of other things to do ... :-(
Regards
Luca