Re: [VOTE] Merge Knuth branch back into HEAD

2005-05-13 Thread Jeremias Maerki

On 12.05.2005 22:00:54 Andreas L. Delmelle wrote:
snip/
 I get carried away sometimes :-)

Happens to me all the time. This stuff gets so complicated.

snip/

  I can see the potential benefit by not having to take all the
  influencing border sources into account, but precalculating some border
  and thus optimizing the code a bit. The beauty of the current approach
  IMO lies within the concentration of the calculation in one spot. I
  think your approach would make the border resolution more decentralized
  and therefore harder to track down in the already complex maze.
 
 Partly agreed. The more I think about the starting and ending GridUnits as
 row-boundaries, the more it seems like much of the logic I saw 'moving up'
 to the row-level would ultimately have to end up in the GridUnit anyway.
 Same for the Body, so, very much like it is now.
 
 Still, I believe we can keep the calculation in one central spot, only split
 it up a bit, and steer the parts of that calculation from above (or below,
 depending on the view), so that certain parts get executed less frequently.

That would be good. I thought about doing something like that but
decided to get the functionality done before going into optimization.

 i.e. something like TableRowIterator.resolveBorders() would, on the one
 hand, finish the previous row's GridUnits' after-border segments --if any--
 and trigger the preparatory work for the next row's GridUnits'
 resolveBorders(). The GridUnits, at their end, would do the same for the
 before-borders of the next row (or the after-borders of the body/table on
 breaks), so that the row-iterator can complete them the next time it
 arrives at resolveBorders(). That last call could also be forced from a
 break situation, in the middle of a real 'physical' row, in order to finish
 the after-borders of the table/body/footer on the break, which is the only
 situation in which the table and body borders become more relevant.
 
 This kind of interaction doesn't strike me as increasing complexity that
 much.

Good, glad to have a hand to help. :-)

 Quite on the contrary, since the resolving of the borders also happens
 at row-level, which seems to be an attractive place to deal with breaks, as
 we should have access to all related border segments in one spot.
 Although I may be missing some very nasty consequences here... :-/

There are a few. I'd appreciate it if you would invest the time to
investigate this. The more people who know about this, the better.

 I'll
 think it over a bit more first, but IMO, possibly having to decide between 5
 or 6 sets of border-specs for the segments of, say, 10 grid-units is making
 matters more complex than
 - rule out 3 or 4 sets once, for all 10 of them
 - decide between 2 sets, one GridUnit at a time
 ...
 - decide between 2 sets,
   or possibly finish up in case of a break, one at a time
 
 One immediate constraint that strikes me is that we would, strictly
 speaking, have no definite values for the border-widths of the after-border
 segments of a row's GridUnits after the first pass, since the border-widths
 for these segments could still be altered by the call to
 TRIter.resolveBorders() that would be made between the current row and the
 next row (or break)... Ultimately, we would only have a full idea on the
 effective settings of a segment after the *last* GridUnit it belongs to has
 called resolveBorders(), or *after* an effective break has triggered
 TRIter.resolveBorders().

yep.

 Can we give the border-resolution a head-start, say create a 'buffer' of
 resolved border-specs for up to five rows ahead of the main layout? Hmm...
 maybe a bit ambitious... Two iterators running synchronously, but the
 'heavy' one only starts after the 'light' one has reached five, from that
 point on, alternate between the two iterators until the first one runs out
 of rows, use up the buffer...? Now, that seems interesting, *if* at all
 manageable of course.
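The buffered look-ahead idea quoted above could be sketched roughly as follows. This is entirely hypothetical: `Row`, `resolveBorders()` and `doLayout()` are placeholders for whatever the 'light' and 'heavy' passes would really do, not FOP's actual API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Sketch of the "head-start" idea: a light border-resolution pass runs a
// fixed number of rows ahead of the heavy layout pass, keeping the
// pre-resolved rows in a small buffer. Hypothetical code, not FOP's.
public class LookAheadSketch {

    static final int HEAD_START = 5;

    interface Row {
        void resolveBorders(); // the 'light' pass
        void doLayout();       // the 'heavy' pass
    }

    static void run(Iterator<? extends Row> rows) {
        Deque<Row> buffer = new ArrayDeque<>();
        // light pass: fill the buffer up to HEAD_START rows ahead
        while (rows.hasNext() && buffer.size() < HEAD_START) {
            Row r = rows.next();
            r.resolveBorders();
            buffer.addLast(r);
        }
        // alternate: lay out the oldest buffered row, then pre-resolve one more,
        // until the source runs out of rows; then use up the buffer
        while (!buffer.isEmpty()) {
            buffer.removeFirst().doLayout();
            if (rows.hasNext()) {
                Row r = rows.next();
                r.resolveBorders();
                buffer.addLast(r);
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            final int n = i;
            rows.add(new Row() {
                public void resolveBorders() { log.add("resolve " + n); }
                public void doLayout() { log.add("layout " + n); }
            });
        }
        run(rows.iterator());
        System.out.println(log);
    }
}
```

Note that the buffer doubles as the missing GC boundary: a row is dropped as soon as its layout is done, so memory stays bounded even on long tables.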

At any rate, you can already look ahead with the TableRowIterator as
much as you like. What it's currently missing is a GC mechanism for the
rows that are no longer needed. That means we still have potential
problems with long tables.

snip/

   When the first row is finished, we would have two unfinished segments of
   after-borders. Make these available to the next row as preliminary
   before-border segments, which it can then resolve with its own. Next, we
   already know that this row contains three cells, so we need some sort of
   distribution method here --i.e. calculations based on relevant GridUnit
   IPD?-- to give an indication as to which segment(s) need to be
   split into how many parts
 
  Now you lost me. I see a border segment as identical to a GridUnit's
  border (not a cell's border). That's the impression I got from the spec.
  Are you talking about handling column spanning here?
 
 
 Sorry, I must have been getting a bit sleepy. Picture the same table
 upside-down to begin with... But indeed, it's much better to just have one
 segment per 

Re: [VOTE] Merge Knuth branch back into HEAD

2005-05-13 Thread Jeremias Maerki

On 13.05.2005 18:01:44 Andreas L. Delmelle wrote:
snip/
  if you would like to take a stab at the collapsed border
  resolution, then please do. I'll leave it aside for the moment and will
  concentrate on implementing or fixing the rest of the important features
  for table layout (BPD/height props, breaks, keeps, etc.).
 
 Will certainly do so. Up to now, most of my time has been spent catching
 up with you guys... Hopefully there will be no more unforeseen circumstances
 that keep me away for a few months, so I can finally get some really
 constructive work done on those 'ideas' of mine...

Wonderful! If I can help in any way, just yell. And don't forget to
remind me if I don't write that nasty example I promised. :-)

 snip /
  I hope I wasn't disrespectful by snipping out and not replying to parts
  of your post.
 
 Well, I wouldn't worry too much about that. I'm rather thick-skinned, if you
 know what I mean...
 
 And if my initial reply to the vote came across as disrespectful to
 you --since you obviously have invested a great deal of your time into that
 algorithm, and I made it seem like it wasn't worth much-- my apologies of
 course!

There was absolutely no problem there.

Jeremias Maerki



Re: [VOTE] Merge Knuth branch back into HEAD

2005-05-11 Thread Jeremias Maerki

On 11.05.2005 00:52:21 Andreas L. Delmelle wrote:
snip/
   Jeremias, what do you mean with complexity in certain areas? Tables
   only, or are there other complexities that you perceived as
   overwhelming?
 
  No, it's mainly the complexity of the collapsed border model ...
 
 Yes, I've been thinking and reading up on that stuff, and somehow it seems a
 bit --a tiny bit-- simpler if you try to figure out
 'collapse-with-precedence' first, since you have to decide on a purely
 numerical basis, so it may facilitate translation into an algorithm. The
 'Eye Catching' question could then be solved as a scenario with fixed
 precedence values for the different styles, plus a factor for the widths,
 etc.

Hmm, I think you got the wrong impression. It's not that I'm having
problems with the border resolution. This actually works fine by now even
if it might need some additional tweaking for calculating new
constellations in break conditions. The design of the resolution is
already prepared to easily handle the precedence variant. It's just a
matter of creating an additional subclass (of CollapsingBorderModel).
The data sources for the decisions are there. The real problem lies
within the effects that borders have on the generated combined elements
list after they have been resolved. I'm sorry for not making that clear
enough. Still... (read on below)

 Still, after a look at the code and the Wiki, I had the impression that this
 path hadn't yet been taken into consideration, so hopefully this offers some
 relief...

Hmm, I actually left that away simply because I thought it would be
quite simple. I could be wrong though.

 Starting with the simplest case, a rough description:
 p(table) > p(body) > p(row) > p(column) > p(cell) means
    table-border for
      border-start of the first GridUnit in a Row
      border-end of the last GridUnit in a Row
      border-before of all GridUnits in the first Row of a (sub)page
      border-after of all GridUnits in the last Row of a (sub)page
    row-border for
      border-before of all GridUnits not in the first Row of a (sub)page
      border-after of all GridUnits not in the last Row of a (sub)page
    column-border for
      border-start for all GridUnits except when first in a Row
      border-end for all GridUnits except when last in a Row
    body-borders and cell-borders are overruled

I probably don't get what you're targeting at, but one thing disturbs me
here: you may not have a Row instance.

 Mind the Capitals, and what I have already mentioned in a previous
 post --about doing part of the resolving at row-level-- begins to make a bit
 more sense now. When the BodyLM is initialized, you can already decide
 between 'table' and 'body' borders

(for non-break conditions)

 and pass that result to the RowLM, 

I don't use the RowLM anymore. There's only the TableLM, the
TableContentLM and the CellLM. I know I should have removed the obsolete
LMs by now. I simply was too deep in the mud to notice.

The closest place where the functionality of the RowLM now lives is the
TableRowIterator. You'd probably pass the result to that one.

 that
 passes that result OR its own border-specs to its GridUnits, and the
 GridUnits ultimately only have to decide between the relevant 'row'-borders,
 'column'-borders and their own... I think one would have a hard time getting
 closer to the meaning of 'collapsing' than this approach.

I can see the potential benefit by not having to take all the
influencing border sources into account, but precalculating some border
and thus optimizing the code a bit. The beauty of the current approach
IMO lies within the concentration of the calculation in one spot. I
think your approach would make the border resolution more decentralized
and therefore harder to track down in the already complex maze.

 What seemed a bit awkward while I was browsing through the relevant code was
 the constant need to pass the 'side' of the GridUnit around when resolving
 the border :-/ Still, that seems more like a consequence of delaying the
 entire border-resolving process until the level of the GridUnit is reached.

constant need? There are four calls to GridUnit.resolveBorder() in the
code, one for each side. There will be a couple of additional ones once
we have figured out how to resolve (or better store) the borders for the
break conditions.

resolveBorder() calls go straight into determineWinner() calls on the
CollapsingBorderModel. It's not that awkward, is it?
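FOP's actual CollapsingBorderModel is more involved than this, but the core of a determineWinner() call could look roughly like the following sketch of the CSS2/XSL conflict-resolution rules ('hidden' suppresses everything, 'none' always loses, then the wider border wins, then the stronger style). The class and method names here are illustrative, not FOP's real signatures, and the origin-precedence tie-breaker (cell over row over column over table, etc.) is deliberately left out.

```java
// Minimal sketch of "determine the winning border" for the collapsing
// border model, following the CSS2 / XSL 1.0 conflict-resolution rules.
// Illustrative names; FOP's CollapsingBorderModel is more involved.
public class DetermineWinnerSketch {

    // style precedence, strongest first (per CSS2 section 17.6.2)
    static final String[] STYLE_RANK = {
        "hidden", "double", "solid", "dashed", "dotted",
        "ridge", "outset", "groove", "inset", "none"
    };

    static int rank(String style) {
        for (int i = 0; i < STYLE_RANK.length; i++) {
            if (STYLE_RANK[i].equals(style)) return i;
        }
        throw new IllegalArgumentException("unknown style: " + style);
    }

    /** Returns 0 if the first border wins, 1 if the second one does.
     *  Widths are in millipoints. */
    static int determineWinner(int width1, String style1,
                               int width2, String style2) {
        // 1. 'hidden' suppresses the whole segment
        if (style1.equals("hidden")) return 0;
        if (style2.equals("hidden")) return 1;
        // 2. 'none' always loses
        if (style1.equals("none")) return 1;
        if (style2.equals("none")) return 0;
        // 3. the wider border wins
        if (width1 != width2) return (width1 > width2) ? 0 : 1;
        // 4. equal widths: the stronger style wins
        return (rank(style1) <= rank(style2)) ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println(determineWinner(2000, "solid", 1000, "double")); // wider wins: 0
        System.out.println(determineWinner(1000, "solid", 1000, "double")); // double beats solid: 1
    }
}
```

The precedence variant mentioned above would simply insert a numeric-precedence comparison ahead of step 3, which is why a subclass swap suffices.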

 Also, I was juggling with the idea of creating a BorderSegment object that
 operates in conjunction with the GridUnit, but 'in between and over' Rows as
 it were... Instead of having a GridUnit 'resolve its own borders', the
 BorderSegments 'resolve themselves' at the appropriate time. In essence,
 those segments need to know nothing about 'before' or 'after', 'start' or
 'end', they just pick the right border spec from the given set. What gave me
 this idea, was Simon's example, where you need information 

Re: [VOTE] Merge Knuth branch back into HEAD

2005-05-11 Thread Chris Bowditch
Jeremias Maerki wrote:
I'm not where I would like to be, yet (with table layout). Overall,
there are still a number of problems to be solved. These are (a potentially
incomplete list):
- Table layout including header, footer, spans and borders (*)
- Markers
- before-floats and footnotes
- keeps and breaks on tables
- strength values for keeps
- the other known table-related problems as documented on the Wiki
- change of available IPD and BPD between pages
- last-page
- column-spanning and column balancing
I just tried running a sample FO that contained markers and got a nasty error. 
Are they broken due to the changes for Knuth page breaking? Do you anticipate 
any pain in fixing them?

snip/
My vote:
At this point I'm only able to give a +0.95 where the missing 0.05 is
due to the fact that the Knuth approach has given me headache after
headache. There are pros and cons to the whole approach. I still cannot
fully exclude the possibility that we're going to hit a dead end.
And I'm still not comfortable with the complexity in certain areas,
although you could probably say that it would be similarly complex with
the old approach. Anyway, I've gotten used to thinking in terms of boxes,
glue and penalties. Were it not for tables, my vote would have been
clearer.
I understand why you are not 100% sure on this vote. However, I still believe 
we are making progress. I'm not convinced the Knuth approach leads to a dead 
end. So here's my +1.

I understand people's concerns about performance. I fully expect it to be slow 
once we get it working. I believe we should start looking for optimizations 
and time-saving ideas once we have a solution that works for most 
scenarios. If we try to make optimizations now, they will be undone once 
we implement the missing features.

Chris


[VOTE] Merge Knuth branch back into HEAD

2005-05-10 Thread Jeremias Maerki
I'm not where I would like to be, yet (with table layout). Overall,
there are still a number of problems to be solved. These are (a potentially
incomplete list):

- Table layout including header, footer, spans and borders (*)
- Markers
- before-floats and footnotes
- keeps and breaks on tables
- strength values for keeps
- the other known table-related problems as documented on the Wiki
- change of available IPD and BPD between pages
- last-page
- column-spanning and column balancing

(*) ATM I've got the basic algorithm but I'm stuck with the many details
that arise from the collapsing border model. I'm going to back off from
this for now and instead I'm going to try and at least make the separate
border model work. This model doesn't have these nasty interactions
between cells that keep my head spinning. Painting this stuff on paper
is hard enough, implementing it is even harder.

Still, we're at a point where we should finally say yes or no to further
pursuing the new page breaking approach. Merging the branch back into
HEAD means a step back for a few features and on the other side a step
forward especially for keeps. I got the impression that the team is
pretty much committed to continue on this path and this vote should
confirm that.

My vote:
At this point I'm only able to give a +0.95 where the missing 0.05 is
due to the fact that the Knuth approach has given me headache after
headache. There are pros and cons to the whole approach. I still cannot
fully exclude the possibility that we're going to hit a dead end.
And I'm still not comfortable with the complexity in certain areas,
although you could probably say that it would be similarly complex with
the old approach. Anyway, I've gotten used to thinking in terms of boxes,
glue and penalties. Were it not for tables, my vote would have been
clearer.

Jeremias Maerki



Re: [VOTE] Merge Knuth branch back into HEAD

2005-05-10 Thread Jeremias Maerki

On 10.05.2005 20:41:19 Simon Pepping wrote:
 My worry with the new approach is performance: We know that the
 algorithms require quite some computational steps, but we have no idea
 whether in the end performance on a large document will be acceptable
 or not. (Perhaps Luca has some experimental evidence from his own
 implementation?)

I still have some performance comparisons on my todo list as preparation
for the ApacheCon session. I can run the examples through the new code
to get an idea. That's a no-brainer with my API wrapper. I'll keep you
posted.

 Jeremias, what do you mean with complexity in certain areas? Tables
 only, or are there other complexities that you perceived as
 overwhelming?

No, it's mainly the complexity of the collapsed border model plus the
implications from row spanning and, if you go further, the handling of the
min/opt/max stuff, which I dared to simply ignore. There are so many
possible interactions. Take the RowBorder2 example. It took me a whole
day to run on paper. And it's still not covering all the possible
cases. If you remove the column span in the header and do some nasty
stuff with the border widths you can create really mean examples. I intend
to write one when I'm in a better mood.


Jeremias Maerki



RE: [VOTE] Merge Knuth branch back into HEAD

2005-05-10 Thread Andreas L. Delmelle
 -Original Message-
 From: Jeremias Maerki [mailto:[EMAIL PROTECTED]

 On 10.05.2005 20:41:19 Simon Pepping wrote:


Hi guys,

For starters: my vote is +1.

I agree with Simon, and also very much feel like we're on the right track
with this. Sure, it will *still* take some work...

snip /
  Jeremias, what do you mean with complexity in certain areas? Tables
  only, or are there other complexities that you perceived as
  overwhelming?

 No, it's mainly the complexity of the collapsed border model ...

Yes, I've been thinking and reading up on that stuff, and somehow it seems a
bit --a tiny bit-- simpler if you try to figure out
'collapse-with-precedence' first, since you have to decide on a purely
numerical basis, so it may facilitate translation into an algorithm. The
'Eye Catching' question could then be solved as a scenario with fixed
precedence values for the different styles, plus a factor for the widths,
etc.

Still, after a look at the code and the Wiki, I had the impression that this
path hadn't yet been taken into consideration, so hopefully this offers some
relief...

Starting with the simplest case, a rough description:
p(table) > p(body) > p(row) > p(column) > p(cell) means
   table-border for
 border-start of the first GridUnit in a Row
 border-end of the last GridUnit in a Row
 border-before of all GridUnits in the first Row of a (sub)page
 border-after of all GridUnits in the last Row of a (sub)page
   row-border for
 border-before of all GridUnits not in the first Row of a (sub)page
 border-after of all GridUnits not in the last Row of a (sub)page
   column-border for
 border-start for all GridUnits except when first in a Row
 border-end for all GridUnits except when last in a Row
   body-borders and cell-borders are overruled
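The decision table above could be sketched as follows, under the assumption that all names (`BorderSourceSketch`, `Side`, `Source`, `winningSource`) are purely illustrative and not part of FOP:

```java
// Hypothetical sketch of the precedence scheme described above: given a
// GridUnit's position in its Row and the Row's position on the (sub)page,
// pick which border source wins for each side. Illustrative names only,
// not FOP's actual API.
public class BorderSourceSketch {

    enum Side { BEFORE, AFTER, START, END }
    enum Source { TABLE, ROW, COLUMN }

    /**
     * Table wins on the outer edges; row wins before/after on inner rows;
     * column wins start/end on inner grid units. Body and cell borders are
     * overruled, so they never appear as a winner here.
     */
    static Source winningSource(Side side,
                                boolean firstInRow, boolean lastInRow,
                                boolean firstRowOnPage, boolean lastRowOnPage) {
        switch (side) {
            case START:  return firstInRow ? Source.TABLE : Source.COLUMN;
            case END:    return lastInRow ? Source.TABLE : Source.COLUMN;
            case BEFORE: return firstRowOnPage ? Source.TABLE : Source.ROW;
            case AFTER:  return lastRowOnPage ? Source.TABLE : Source.ROW;
            default:     throw new IllegalArgumentException("unknown side");
        }
    }

    public static void main(String[] args) {
        // first GridUnit of an inner row: start edge comes from the table
        System.out.println(winningSource(Side.START, true, false, false, false));
        // inner GridUnit of an inner row: before edge comes from the row
        System.out.println(winningSource(Side.BEFORE, false, false, false, false));
    }
}
```

This only covers the non-break case; on a break, "last Row of a (sub)page" shifts to the row at the break, which is exactly the part flagged as tricky elsewhere in the thread.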

Mind the Capitals, and what I have already mentioned in a previous
post --about doing part of the resolving at row-level-- begins to make a bit
more sense now. When the BodyLM is initialized, you can already decide
between 'table' and 'body' borders and pass that result to the RowLM, that
passes that result OR its own border-specs to its GridUnits, and the
GridUnits ultimately only have to decide between the relevant 'row'-borders,
'column'-borders and their own... I think one would have a hard time getting
closer to the meaning of 'collapsing' than this approach.

What seemed a bit awkward while I was browsing through the relevant code was
the constant need to pass the 'side' of the GridUnit around when resolving
the border :-/ Still, that seems more like a consequence of delaying the
entire border-resolving process until the level of the GridUnit is reached.

Also, I was juggling with the idea of creating a BorderSegment object that
operates in conjunction with the GridUnit, but 'in between and over' Rows as
it were... Instead of having a GridUnit 'resolve its own borders', the
BorderSegments 'resolve themselves' at the appropriate time. In essence,
those segments need to know nothing about 'before' or 'after', 'start' or
'end', they just pick the right border spec from the given set. What gave me
this idea, was Simon's example, where you need information about the
GridUnits for the full two rows --to know how many segments there are, how
they are distributed and which sets of border-specs are relevant for each of
the segments.

When the first row is finished, we would have two unfinished segments of
after-borders. Make these available to the next row as preliminary
before-border segments, which it can then resolve with its own. Next, we
already know that this row contains three cells, so we need some sort of
distribution method here --i.e. calculations based on relevant GridUnit
IPD?-- to give an indication as to which segment(s) need to be split into
how many parts
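The splitting itself is mostly geometry: the horizontal border between two rows breaks into segments at every cell boundary of either row. A sketch of that distribution step, with a hypothetical helper name and widths standing in for the GridUnit IPDs mentioned above (FOP has no such method):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Sketch of the segment-splitting idea: the border between two rows breaks
// into segments at the union of both rows' cell boundaries. Given each
// row's cell widths (e.g. in millipoints), compute the segment widths.
// Hypothetical helper, not FOP code.
public class SegmentSplitSketch {

    static List<Integer> segmentWidths(int[] upperCellWidths, int[] lowerCellWidths) {
        // collect the cumulative cell boundaries of both rows, deduplicated
        TreeSet<Integer> cuts = new TreeSet<>();
        int x = 0;
        for (int w : upperCellWidths) { x += w; cuts.add(x); }
        x = 0;
        for (int w : lowerCellWidths) { x += w; cuts.add(x); }
        // the distances between consecutive cuts are the segment widths
        List<Integer> widths = new ArrayList<>();
        int prev = 0;
        for (int cut : cuts) { widths.add(cut - prev); prev = cut; }
        return widths;
    }

    public static void main(String[] args) {
        // row above: two cells (60 + 40); row below: three cells (30 + 30 + 40)
        System.out.println(segmentWidths(new int[]{60, 40}, new int[]{30, 30, 40}));
        // → [30, 30, 40]: the upper row's first segment is split in two
    }
}
```

Each resulting segment then resolves between exactly one after-border set from above and one before-border set from below, which is what keeps the two-cells-at-a-time property mentioned further down.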

Then again, it seems only *really* necessary for before- and after-borders.
The border-specs for the vertical border segments could be made available to
a GridUnit through the Column (? via the Row's column list: end-border of
previous GridUnit = the resolved start-border of the current GridUnit's
Column --or am I thinking too linearly, too LRTB, maybe?)

In theory --here I go again...-- it would then be the BorderSegments that
need information on the border specs on Table/Body/Row/(Column?)/Cell for at
most two cells at the same time. I don't know if, in practice, this idea
would save much compared to what you currently have... but it somehow seems
attractive, especially in combination with the approach of resolving in
different stages.


Hope this helps! :-)

Cheers,

Andreas



Re: [VOTE] Merge Knuth branch back into HEAD

2005-05-10 Thread Peter B. West
Jeremias Maerki wrote:
Still, we're at a point where we should finally say yes or no to further
pursuing the new page breaking approach. Merging the branch back into
HEAD means a step back for a few features and on the other side a step
forward especially for keeps. I got the impression that the team is
pretty much committed to continue on this path and this vote should
confirm that.
The team has made remarkable progress in this.  My congratulations. 
From the outside, I share the reservations expressed by Jeremias and 
Simon.  It will be an extremely impressive achievement if they are all 
resolved.

Peter
--
Peter B. West http://cv.pbw.id.au/
Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/