Re: Applying Finn Bock's patch (again) :-)

2005-05-03 Thread Peter B. West
Jeremias Maerki wrote:
So far the most helpful resource for understanding the whole story was
"Digital Typography" by Donald Knuth (chapter 3 only):
http://www.amazon.com/exec/obidos/tg/detail/-/1575860104
The chapter in question (Breaking Paragraphs Into Lines) was originally 
published by Knuth and Plass in "Software - Practice and Experience 11 
(1981), 1119-1184."

Anyone who is interested may be able to obtain a copy of the ogiginal 
paper more cheaply than the book.

Peter
--
Peter B. West 
Folio  


Re: Applying Finn Bock's patch (again) :-)

2005-05-03 Thread Glen Mazza
Thanks for the pointers.  I already got the Digital Typography book a 
few months ago-- I didn't know that Ch. 3 was also helpful for the 
*page* breaking.  I'll look more at that chapter again.

Glen
Jeremias Maerki wrote:
(I'm currently not getting all the posts from the mailing lists. The ISP
started filtering spam 2005-05-01 and is doing a poor job. Sorry if
I miss anything. For example, I didn't get Luca's first mail, only
Glen's answer.)
Glen,
the branch uses Knuth's algorithms for both line and page breaking.
Thankfully, we even have a lot of code sharing between the two aspects.
Michael Plass who wrote that research paper was Knuth's student. His
work is also based on Knuth's work on line breaking. So far I haven't
been able to get a lot out of his paper, although he writes about
footnotes for example, which will eventually come in handy. I get the
impressions that, in certain areas, we're going farther than what was
described in the papers I could find.
So far the most helpful resource for understanding the whole story was
"Digital Typography" by Donald Knuth (chapter 3 only):
http://www.amazon.com/exec/obidos/tg/detail/-/1575860104
Plass' paper on the other side is the only paper that sheds some more light
on page breaking studd, but not much IMO. I've also found stuff that
criticized some of Plass' statements. Anyway, we have to find out
everything the hard way. :-)
(comments on Luca's message below)
 




Re: Applying Finn Bock's patch (again) :-)

2005-05-03 Thread Jeremias Maerki
(I'm currently not getting all the posts from the mailing lists. The ISP
started filtering spam 2005-05-01 and is doing a poor job. Sorry if
I miss anything. For example, I didn't get Luca's first mail, only
Glen's answer.)

Glen,
the branch uses Knuth's algorithms for both line and page breaking.
Thankfully, we even have a lot of code sharing between the two aspects.
Michael Plass who wrote that research paper was Knuth's student. His
work is also based on Knuth's work on line breaking. So far I haven't
been able to get a lot out of his paper, although he writes about
footnotes for example, which will eventually come in handy. I get the
impressions that, in certain areas, we're going farther than what was
described in the papers I could find.

So far the most helpful resource for understanding the whole story was
"Digital Typography" by Donald Knuth (chapter 3 only):
http://www.amazon.com/exec/obidos/tg/detail/-/1575860104

Plass' paper on the other side is the only paper that sheds some more light
on page breaking studd, but not much IMO. I've also found stuff that
criticized some of Plass' statements. Anyway, we have to find out
everything the hard way. :-)

(comments on Luca's message below)

On 03.05.2005 05:29:44 Glen Mazza wrote:
> BTW, is the page breaking also using Knuth's algorithms, or is it from 
> the research paper* that Jeremias ordered a few months back and was 
> mentioning to us?  I have been generically calling both the line- and 
> page-breaking the "Knuth code"--I don't know how correct that is.
> 
> Thanks,
> Glen
> 
> * Also, would it help me to order that research paper--how helpful would 
> it be in understanding our code?
> 
> 
> 
> Luca Furini wrote:
> 
> >I realized just a few days ago that the breaking algorithm (in the
> >BreakingAlgorithm class) is not fully patched with Finn's great
> >refactoring of the Knuth code (bug 32612).
> >
> >I must admit that this is due to my laziness: when I was playing with
> >Knuth's algorithm for page breaking I applied to my local copy of the code
> >only the new restarting strategy, so, although Jeremias applied the patch
> >before the branch, most benefits in performance and readability got lost
> >in the merge.

So it's more my fault than yours. I mostly didn't understand the code
back then and still have troubles understanding it to the last corner.
Anyway, thank you for fixing this!

> >I have now applied the patch to the branch code: it needed some change in
> >order to fit in the new classes, and I hope I did not introduce errors.
> >:-)

I'll see how the code holds up now. I'm still in the middle of
implementing tables. Smoking head the whole time. :-)

> >A few doubts / questions / comments:
> >
> >- the value BestRecord.INFINITE_DEMERITS: I'm not sure it must be
> >+infinity; if it is a finite value it acts like a threshold, reducing the
> >number of active nodes. On the other hand, the finite value should be
> >carefully chosen, otherwise breakpoints with an allowed adjustment ratio
> >could be later discarded because it has more than "finite infinity"
> >demerits (this is something that Finn pointed out some time ago). What
> >about a finite value computed according to the adjustment threshold and
> >the line width?

Some experimenting here certainly can't hurt.

> >- in addition to Finn's restarting strategy, lastTooShort is resetted to
> >null after the creation of new nodes: the newly created nodes are surely
> >better than it, and a "lastTooShort" solution will be found later; it will
> >most likely have more demerits (demerits always increase, when a new line
> >/ page is created), but it will be better anyway.
> >
> >- as now KnuthSequence extends ArrayList instead of LinkedList, a few more
> >optimizations could be done here and there: using get() instead of
> >ListIterators, for example.
> >
> >Regards
> >Luca
> >
> >  
> >



Jeremias Maerki



Re: Applying Finn Bock's patch (again) :-)

2005-05-02 Thread Glen Mazza
BTW, is the page breaking also using Knuth's algorithms, or is it from 
the research paper* that Jeremias ordered a few months back and was 
mentioning to us?  I have been generically calling both the line- and 
page-breaking the "Knuth code"--I don't know how correct that is.

Thanks,
Glen
* Also, would it help me to order that research paper--how helpful would 
it be in understanding our code?


Luca Furini wrote:
I realized just a few days ago that the breaking algorithm (in the
BreakingAlgorithm class) is not fully patched with Finn's great
refactoring of the Knuth code (bug 32612).
I must admit that this is due to my laziness: when I was playing with
Knuth's algorithm for page breaking I applied to my local copy of the code
only the new restarting strategy, so, although Jeremias applied the patch
before the branch, most benefits in performance and readability got lost
in the merge.
I have now applied the patch to the branch code: it needed some change in
order to fit in the new classes, and I hope I did not introduce errors.
:-)
A few doubts / questions / comments:
- the value BestRecord.INFINITE_DEMERITS: I'm not sure it must be
+infinity; if it is a finite value it acts like a threshold, reducing the
number of active nodes. On the other hand, the finite value should be
carefully chosen, otherwise breakpoints with an allowed adjustment ratio
could be later discarded because it has more than "finite infinity"
demerits (this is something that Finn pointed out some time ago). What
about a finite value computed according to the adjustment threshold and
the line width?
- in addition to Finn's restarting strategy, lastTooShort is resetted to
null after the creation of new nodes: the newly created nodes are surely
better than it, and a "lastTooShort" solution will be found later; it will
most likely have more demerits (demerits always increase, when a new line
/ page is created), but it will be better anyway.
- as now KnuthSequence extends ArrayList instead of LinkedList, a few more
optimizations could be done here and there: using get() instead of
ListIterators, for example.
Regards
   Luca