(I'm currently not getting all the posts from the mailing lists. The ISP
started filtering spam 2005-05-01 and is doing a poor job. Sorry if
I miss anything. For example, I didn't get Luca's first mail, only
the branch uses Knuth's algorithms for both line and page breaking.
Thankfully, we even have a lot of code sharing between the two aspects.
Michael Plass who wrote that research paper was Knuth's student. His
work is also based on Knuth's work on line breaking. So far I haven't
been able to get a lot out of his paper, although he writes about
footnotes for example, which will eventually come in handy. I get the
impressions that, in certain areas, we're going farther than what was
described in the papers I could find.
So far the most helpful resource for understanding the whole story was
"Digital Typography" by Donald Knuth (chapter 3 only):
Plass' paper on the other side is the only paper that sheds some more light
on page breaking studd, but not much IMO. I've also found stuff that
criticized some of Plass' statements. Anyway, we have to find out
everything the hard way. :-)
(comments on Luca's message below)
On 03.05.2005 05:29:44 Glen Mazza wrote:
> BTW, is the page breaking also using Knuth's algorithms, or is it from
> the research paper* that Jeremias ordered a few months back and was
> mentioning to us? I have been generically calling both the line- and
> page-breaking the "Knuth code"--I don't know how correct that is.
> * Also, would it help me to order that research paper--how helpful would
> it be in understanding our code?
> Luca Furini wrote:
> >I realized just a few days ago that the breaking algorithm (in the
> >BreakingAlgorithm class) is not fully patched with Finn's great
> >refactoring of the Knuth code (bug 32612).
> >I must admit that this is due to my laziness: when I was playing with
> >Knuth's algorithm for page breaking I applied to my local copy of the code
> >only the new restarting strategy, so, although Jeremias applied the patch
> >before the branch, most benefits in performance and readability got lost
> >in the merge.
So it's more my fault than yours. I mostly didn't understand the code
back then and still have troubles understanding it to the last corner.
Anyway, thank you for fixing this!
> >I have now applied the patch to the branch code: it needed some change in
> >order to fit in the new classes, and I hope I did not introduce errors.
I'll see how the code holds up now. I'm still in the middle of
implementing tables. Smoking head the whole time. :-)
> >A few doubts / questions / comments:
> >- the value BestRecord.INFINITE_DEMERITS: I'm not sure it must be
> >+infinity; if it is a finite value it acts like a threshold, reducing the
> >number of active nodes. On the other hand, the finite value should be
> >carefully chosen, otherwise breakpoints with an allowed adjustment ratio
> >could be later discarded because it has more than "finite infinity"
> >demerits (this is something that Finn pointed out some time ago). What
> >about a finite value computed according to the adjustment threshold and
> >the line width?
Some experimenting here certainly can't hurt.
> >- in addition to Finn's restarting strategy, lastTooShort is resetted to
> >null after the creation of new nodes: the newly created nodes are surely
> >better than it, and a "lastTooShort" solution will be found later; it will
> >most likely have more demerits (demerits always increase, when a new line
> >/ page is created), but it will be better anyway.
> >- as now KnuthSequence extends ArrayList instead of LinkedList, a few more
> >optimizations could be done here and there: using get() instead of
> >ListIterators, for example.
> > Luca