Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-04 Thread Jeremias Maerki
An alternative link to the documents Renaud referred to:
http://www.le-hacker.org/hacks/pagination/

There are a few other quite interesting documents on layout questions,
particularly something about tables which might come in handy later.

On 04.03.2005 02:14:56 Renaud Richardet wrote:
 Vincent Hennebert wrote:
  By looking for this reference, I found the following article:
  www.pi6.fernuni-hagen.de/publ/tr234.pdf
  It's entitled 'On the Pagination of Complex Documents' (actually it's also
  referencing Plass). 
 
 There's another article, where the top level of the algorithm is
 presented (page 8).
 http://www.pi6.fernuni-hagen.de/publ/tr205.pdf
 
 Those 2 articles are summaries of a book. The link to command the book
 can be found at the bottom of
 http://web.informatik.uni-bonn.de/I/research/Pagination/
 
 Renaud



Jeremias Maerki



Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Jeremias Maerki
While looking for material on page breaking I found several references
to this document:

http://wwwlib.umi.com/dissertations/fullcit/8124134

Does anyone know if it's worth ordering and waiting for it? The
unfortunate thing is that they don't seem to have a PDF version that I
could download immediately for a reasonable fee.

Thanks,
Jeremias Maerki



RE: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Victor Mote
Jeremias Maerki wrote:

 While looking for material on page breaking I found several 
 references to this document:
 
 http://wwwlib.umi.com/dissertations/fullcit/8124134
 
 Does anyone know if it's worth ordering and waiting for it? 
 The unfortunate thing is that they don't seem to have a PDF 
 version that I could download immediately for a reasonable fee.

Wow. This looks like it is very valuable. I have ordered it for my own use,
and I'll be glad to give you a book review when it arrives to help you
decide whether it is worthwhile for you or not.

I am especially interested in the summary's comment: For certain simple
badness functions, the pagination problem is NP-complete; Dealing with that
challenge is the likely tricky spot in all of this. My intuition has always
been that the page-breaking problem is much more complicated than the
line-breaking one, partly because lines must be laid out to even think about
page-breaking (and line lengths can change as they move around), partly
because you are effectively working with changes in two dimensions instead
of one, and partly because there seem to me to be a lot more variables in
the problem. I am hoping to find some insight into the detection and
workarounds for the NP-complete situations.

Note that Stanford is Knuth's school, the date year is the same as that of
Chapter 3 of Knuth's Digital Typography, and that the author is the
co-author of that article. It may be possible to infer the same information
from looking at the TeX source code. Also, another source of similar
information would be Volume I of Knuth's Computers and Typesetting, aka
The TeXbook. It is essentially a commentary on TeX, by Knuth. Chapter 15
is entitled How TeX Makes Lines into Pages.

You guys are way ahead of me in terms of thinking about how to implement
this stuff. As you know, my approach has been to leave this stuff for last,
preferring instead to solve the outer-layer problems first, and provide for
multiple implementations that can be improved in parallel. However, I have a
great interest in your efforts, and will be glad to help any way that I can.
And, FWIW, I think you are on the right general track, in this regard at
least.

Victor Mote



Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Jeremias Maerki

On 03.03.2005 16:19:24 Victor Mote wrote:
 Jeremias Maerki wrote:
 
  While looking for material on page breaking I found several 
  references to this document:
  
  http://wwwlib.umi.com/dissertations/fullcit/8124134
  
  Does anyone know if it's worth ordering and waiting for it? 
  The unfortunate thing is that they don't seem to have a PDF 
  version that I could download immediately for a reasonable fee.
 
 Wow. This looks like it is very valuable. I have ordered it for my own use,
 and I'll be glad to give you a book review when it arrives to help you
 decide whether it is worthwhile for you or not.

I will probably order it anyway because I need it yesterday. :-) That's
why I wanted the PDF version which they don't seem to have. Sigh.

 I am especially interested in the summary's comment: For certain simple
 badness functions, the pagination problem is NP-complete; Dealing with that
 challenge is the likely tricky spot in all of this. My intuition has always
 been that the page-breaking problem is much more complicated than the
 line-breaking one, partly because lines must be laid out to even think about
 page-breaking (and line lengths can change as they move around), partly
 because you are effectively working with changes in two dimensions instead
 of one, and partly because there seem to me to be a lot more variables in
 the problem. I am hoping to find some insight into the detection and
 workarounds for the NP-complete situations.

I've found other references to discussion about communication between
line and page breaking algorithms. But that stuff was mostly
overview-style and written in a language I don't understand well:
universitary language. That's probably the first time I regret stopping
university after one semester because I hate mathematics. :-)

 Note that Stanford is Knuth's school, the date year is the same as that of
 Chapter 3 of Knuth's Digital Typography, and that the author is the
 co-author of that article. It may be possible to infer the same information
 from looking at the TeX source code. Also, another source of similar
 information would be Volume I of Knuth's Computers and Typesetting, aka
 The TeXbook. It is essentially a commentary on TeX, by Knuth. Chapter 15
 is entitled How TeX Makes Lines into Pages.

I haven't dared look into the TeX source code, yet, but I've read most
of the chapter you mention. Didn't really help because there are many
many TeX-specific things in there.

 You guys are way ahead of me in terms of thinking about how to implement
 this stuff. As you know, my approach has been to leave this stuff for last,
 preferring instead to solve the outer-layer problems first, and provide for
 multiple implementations that can be improved in parallel. However, I have a
 great interest in your efforts, and will be glad to help any way that I can.
 And, FWIW, I think you are on the right general track, in this regard at
 least.

I very much hope so. But it becomes more and more apparent that this
will be the greatest challenge in my programmer's life. Wow indeed.


Jeremias Maerki



Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Glen Mazza
Happy to see how you have reprioritized your efforts
over the past two months [1], and much, much for the
better.

Glen

--- Jeremias Maerki [EMAIL PROTECTED] wrote:
 
 I very much hope so. But it becomes more and more
 apparent that this
 will be the greatest challenge in my programmer's
 life. Wow indeed.
 
 
 Jeremias Maerki
 
 

[1] http://marc.theaimsgroup.com/?l=fop-devm=110495579414655w=2


Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Jeremias Maerki
Actually, I haven't. The RTF edition idea (which I haven't discarded,
yet, just postponed due to time shortage) is still very high on my
private priority list. As you can guess there's currently another one
which comes first.

On 03.03.2005 18:31:12 Glen Mazza wrote:
 Happy to see how you have reprioritized your efforts
 over the past two months [1], and much, much for the
 better.
 
 Glen
 
 --- Jeremias Maerki [EMAIL PROTECTED] wrote:
  
  I very much hope so. But it becomes more and more
  apparent that this
  will be the greatest challenge in my programmer's
  life. Wow indeed.
  
  
  Jeremias Maerki
  
  
 
 [1] http://marc.theaimsgroup.com/?l=fop-devm=110495579414655w=2



Jeremias Maerki



Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Simon Pepping
On Thu, Mar 03, 2005 at 08:19:24AM -0700, Victor Mote wrote:
 Jeremias Maerki wrote:
 
  While looking for material on page breaking I found several 
  references to this document:
  
  http://wwwlib.umi.com/dissertations/fullcit/8124134
  
  Does anyone know if it's worth ordering and waiting for it? 
  The unfortunate thing is that they don't seem to have a PDF 
  version that I could download immediately for a reasonable fee.
 
 Wow. This looks like it is very valuable. I have ordered it for my own use,
 and I'll be glad to give you a book review when it arrives to help you
 decide whether it is worthwhile for you or not.

I do not know it. It sounds like I should buy it as well.
 
 Note that Stanford is Knuth's school, the date year is the same as that of
 Chapter 3 of Knuth's Digital Typography, and that the author is the
 co-author of that article. It may be possible to infer the same information
 from looking at the TeX source code. Also, another source of similar
 information would be Volume I of Knuth's Computers and Typesetting, aka
 The TeXbook. It is essentially a commentary on TeX, by Knuth. Chapter 15
 is entitled How TeX Makes Lines into Pages.

Note that The TeXbook is TeX's user guide. Yes, Knuth's users are
quite advanced. It was my first book in the direction of computers,
and one of the most inspirational I have read.

The TeX program is described in 'TeX The Program'. That text is weaved
into the program code according to Knuth's literate programming
system. It can be freely extracted from the program code. A TeX
distribution like TeXLive contains the tools to do this. I intend to
do so soon. If you want to do it yourself, TeXLive is available from
the TUG website, www.tug.org. The TeX source code itself is available
from the CTAN repository, but I fear that you have to do some work to
set up all the tools. It is up to yourself to decide whether knowing
TeX's implementation is useful. It is a best-fit algorithm. There is
no look-ahead. For example, TeX is not able to balance two facing
pages (or two columns on a page, which for TeX is the same). I guess
that a dissertation like that cited above contains much more
information than implemented in TeX.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.nl



RE: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Victor Mote
Simon Pepping wrote:

 Note that The TeXbook is TeX's user guide. Yes, Knuth's users 
 are quite advanced. It was my first book in the direction of 
 computers, and one of the most inspirational I have read.
 
 The TeX program is described in 'TeX The Program'. That text 

Oops. You're right. That is volume 2 from the same Computers and
Typesetting series.

 set up all the tools. It is up to yourself to decide whether 
 knowing TeX's implementation is useful. It is a best-fit 
 algorithm. There is no look-ahead. For example, TeX is not 
 able to balance two facing pages (or two columns on a page, 
 which for TeX is the same). I guess that a dissertation like 
 that cited above contains much more information than 
 implemented in TeX.

I'm not sure. The general TeX page-breaking algorithm is discussed in the
paragraphs before Appendix A of Chapter 3 of Digital Typography. The
general box/glue/penalty model is used, but only the current page is
considered. So I think the difference between best-fit and total-fit (as
described here anyway) is the amount of look-ahead itself, not so much the
algorithm. This is why I thought Finn's (IIRC) idea of a variable look-ahead
makes sense. A look-ahead of zero pages is a best-fit, a look-ahead of all
pages is a total-fit. But the algorithm is the same.

Anyway, I agree that the paper is probably the best source, but wanted to
give Jeremias some options.

Victor Mote



RE: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Victor Mote
Victor Mote wrote:

 Oops. You're right. That is volume 2 from the same Computers 
 and Typesetting series.

Er, it is actually volume B.

Victor Mote



Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Vincent Hennebert
Hi Fop Team,
Simon Pepping a écrit :
On Thu, Mar 03, 2005 at 08:19:24AM -0700, Victor Mote wrote:
Jeremias Maerki wrote:
While looking for material on page breaking I found several 
references to this document:

http://wwwlib.umi.com/dissertations/fullcit/8124134
Does anyone know if it's worth ordering and waiting for it? 

By looking for this reference, I found the following article:
www.pi6.fernuni-hagen.de/publ/tr234.pdf
It's entitled 'On the Pagination of Complex Documents' (actually it's also 
referencing Plass). I've read parts of this article and it seems interesting. 
It's well written, quite easy to understand and provides a better algorithm than 
TeX's one. And it's also more recent. It doesn't agree with Plass about the 
NP-hard problem of page layout. Indeed that depends on the formula we are using 
to estimate the badness of a layout. It proposes another formula which seems 
more reasonable and better corresponds to a reader's expectations.
Perhaps it could provide a good basis? I don't have Knuth's 'Digital Typography' 
(I'm considering purchasing it). It may be worthwhile to compare this article 
with what is in the book.

The TeX program is described in 'TeX The Program'. That text is weaved
into the program code according to Knuth's literate programming
system. It can be freely extracted from the program code.

I've already done it, and AFAICT it's not easy to find the way in all the TeX 
stuff (actually nothing is easy with TeX ;-)). I can send you the pdf file if 
you want, feel free to ask me.
Be warned, though: it's a 535 pages (!) document that entirely describes the TeX 
program. I've started looking into it, and, well, it's rather cryptic. It's very 
close to the implementation, it seems to be difficult to get a general idea of 
what it's doing. But I'll investigate a bit more.
And, as Simon wrote, TeX is excellent in line-breaking but not as good in 
page-breaking. It was implemented in the early 80's when memory was expensive.
However, it was written with typographic quality in mind, and that's why it may 
be a good idea to try getting some hints from it.

Vincent


Re: Plass, Michael Frederick: Optimal Pagination Techniques for Automatic Typesetting Systems

2005-03-03 Thread Renaud Richardet
Vincent Hennebert wrote:
 By looking for this reference, I found the following article:
 www.pi6.fernuni-hagen.de/publ/tr234.pdf
 It's entitled 'On the Pagination of Complex Documents' (actually it's also
 referencing Plass). 

There's another article, where the top level of the algorithm is
presented (page 8).
http://www.pi6.fernuni-hagen.de/publ/tr205.pdf

Those 2 articles are summaries of a book. The link to command the book
can be found at the bottom of
http://web.informatik.uni-bonn.de/I/research/Pagination/

Renaud