Re: Page breaking [was: Markers added to the wrong page]

2005-02-08 Thread Finn Bock
[The Web Maestro]
I don't know how the spec deals with this, but I doubt the 
spec cares which algorithm is used. That said, would it be a 
good idea to determine which algorithm to use based on 
something in userconfig.xml or something? If the Knuth system 
is more 'expensive' in terms of resources, we could make 
non-Knuth the default, and enable Knuth via a config file.
Yes indeed. But without measurements I would guess that knuth page 
breaking would be far less expensive than the knuth line breaking. The 
line breaking has to deal with far more elements and because of that a 
far larger set of active nodes.

But clearly a choice will be good, both for page and for line breaking.
[Victor]
The good news is that this is an excellent idea. The bad news is that FOP's
determination to not even tolerate more than one algorithm is what drove FOP
and FOray apart. So you have stumbled in to a somewhat sensitive area, which
might explain the absence of responses (so far).
BTW, since it is relevant here, thanks for your kind comments last week in
another thread. You are ever the peacemaker, a very useful role. As long as
the differences remain as great as they are, we *need* more than one FO
implementation. In the meantime, I am much more useful and productive
outside of FOP than I am in it, and, the FOP developers probably are also.
To my perhaps still naive mind, layout consists of exactly two tasks: 1)
finding line-breaks within logical blocks (nested blocks consisting of a new
logical block), and 2) finding page-breaks between and within logical
blocks. All other tasks can be conveniently handled within the two data
structures, not being dependent on any strategy. The Knuth-style algorithm
is by definition a total-fit algorithm. The scope of such an algorithm for
the first task is the logical block itself. The scope of the second task is
the page-sequence. So, to implement a Knuth-style algorithm (itself a good
idea) for item 2, requires the ability to see an entire page-sequence in the
AreaTree at one time.
Not at all. It is a rather trivial change to knuth to pick a page break 
when there is N pages of lookahead.

If we assume that finished page knuth element arrive one at time to the 
KnuthPage algorithm, the main loop becomes:

addKnuthElement(element) {
if element.isBox():
totalWidth += element.width
else if element.isGlue():
if (previous element is box):
considerLegalBreak()
totalWidth += element.width
totalStretch += element.stretch
totalShrink += element.shrink
else if element.isPenalty():
if element.penalty  INFINITE:
considerLegalBreak()
if activeList.startLine  currentPage + lookahead:
// In the activeList, find the node with the best demerit.
bestNode = findBestDemerit()
// Remove all nodes in activeList which does not have bestNode
// in their parentage.
clearActiveList(bestNode)
makePageBreak(bestNode)
Here it is only clearActiveList() which does not have a fast 
implementation when using the usual implementation of ActiveNode. It 
will require a scan over all the active node and for each node a scan 
back thru the previous links to the node at currentPage+1.

clearActiveList(bestNode):
for node in activeList:
// jump back thru previous (node.page-currentPage) times.
while i++  node.page - currentPage
previous = node.previous
if previous != bestNode:
activeList.remove(node)

The rest of the methods called are similar to the methods in my knuth 
line breaking patch (which is quite similar the current implementation).

My own insecurity comes from figuring out which penalty values and 
demerit algorihtm to use to get keeps and orphans right.

regards,
finn


RE: Page breaking [was: Markers added to the wrong page]

2005-02-08 Thread Victor Mote
Finn Bock wrote:

 Not at all. It is a rather trivial change to knuth to pick a 
 page break when there is N pages of lookahead.
 
 If we assume that finished page knuth element arrive one at 
 time to the KnuthPage algorithm, the main loop becomes:
 
 addKnuthElement(element) {
  if element.isBox():
  totalWidth += element.width
  else if element.isGlue():
  if (previous element is box):
  considerLegalBreak()
  totalWidth += element.width
  totalStretch += element.stretch
  totalShrink += element.shrink
  else if element.isPenalty():
  if element.penalty  INFINITE:
  considerLegalBreak()
  if activeList.startLine  currentPage + lookahead:
  // In the activeList, find the node with the best demerit.
  bestNode = findBestDemerit()
  // Remove all nodes in activeList which does not 
 have bestNode
  // in their parentage.
  clearActiveList(bestNode)
   makePageBreak(bestNode)
 
 Here it is only clearActiveList() which does not have a fast 
 implementation when using the usual implementation of 
 ActiveNode. It will require a scan over all the active node 
 and for each node a scan back thru the previous links to the 
 node at currentPage+1.
 
 clearActiveList(bestNode):
  for node in activeList:
  // jump back thru previous (node.page-currentPage) times.
  while i++  node.page - currentPage
  previous = node.previous
  if previous != bestNode:
  activeList.remove(node)
   
 
 The rest of the methods called are similar to the methods in 
 my knuth line breaking patch (which is quite similar the 
 current implementation).

Hmmm. I guess it depends on where the N in N pages of lookahead comes
from. If that can be set to include the entire page-sequence, then you can
call it a Knuth-style algorithm. Otherwise, I think you have something less,
which is OK for those who don't want the performance hit.

 My own insecurity comes from figuring out which penalty values and 
 demerit algorihtm to use to get keeps and orphans right.

Yes. Fortunately, the XSL-FO standard provides some guidance here, but I
agree that this part will be a challenge.

Good luck. I am glad to see you guys heading down this path. Let me know if
there is anything I can do to help.

Victor Mote



Re: Page breaking [was: Markers added to the wrong page]

2005-02-08 Thread Luca Furini
Finn Bock wrote:

 I would pass the element on the immidiate parent, which recursively
 passes them on the top-level LM in the direction. For inline, the
 toplevel would be LineLM and for blocks it would be the PageLM.

Ok, I misunderstood what you wrote, now I think we were saying the same
thing using different words! :-)

Regards
Luca





FOP at ApacheCon Europe 2005?

2005-02-08 Thread Jeremias Maerki
Most of you will probably have heard that ApacheCon Europe will be happening in
July. I think it would be great if FOP would somehow be visible there.
There's a call for participation ending 2005-03-04. Any ideas?

Who's planning to go to Stuttgart? I won't miss it! It would be cool to
meet some of you there, especially those I haven't met personally or on
the phone yet. Also, it might be a great opportunity for a little
hackathon.

Jeremias Maerki