Re: [VOTE] Merge Knuth branch back into HEAD
On 12.05.2005 22:00:54 Andreas L. Delmelle wrote:

snip/ I get carried away sometimes :-)

Happens to me all the time. This stuff gets so complicated.

snip/ I can see the potential benefit of not having to take all the influencing border sources into account, but precalculating some borders and thus optimizing the code a bit. The beauty of the current approach IMO lies in the concentration of the calculation in one spot. I think your approach would make the border resolution more decentralized and therefore harder to track down in the already complex maze.

Partly agreed. The more I think about the starting and ending GridUnits as row boundaries, the more it seems like much of the logic I saw 'moving up' to the row level would ultimately have to end up in the GridUnit anyway. Same for the Body, so very much like it is now. Still, I believe we can keep the calculation in one central spot, only split it up a bit, and steer the parts of that calculation from above (or below, depending on the view), so that certain parts get executed less frequently.

That would be good. I thought about doing something like that but decided to get the functionality done before going into optimization.

I.e. something like: TableRowIterator.resolveBorders() on the one hand finishes the previous row's GridUnits' after-border segments --if any-- and triggers preparatory work for the next row's GridUnits' resolveBorders(), while the GridUnits at their end do the same for the before-borders of the next row (or the after-borders of the body/table on breaks), so that the next time the row iterator arrives at resolveBorders(), etc. That last call could also be forced from a break situation, in the middle of a real 'physical' row, in order to finish the after-borders of the table/body/footer on the break, which is the only situation in which the table and body borders become more relevant. This kind of interaction doesn't strike me as increasing complexity that much.
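The staged interaction described above might be sketched roughly as follows. This is a hypothetical illustration, not FOP's actual API: the Row class and both method names are invented, borders are reduced to plain widths, and 'the wider border wins' stands in for the full collapsing rules.

```java
/** Hypothetical sketch of the staged resolution idea: a row's
 *  after-border segments stay provisional until the following row
 *  (or a break) has contributed its specs. */
class Row {
    final int[] afterWidths;        // provisional after-border widths, one per column
    boolean afterResolved = false;

    Row(int... afterWidths) {
        this.afterWidths = afterWidths.clone();
    }

    /** Finish this row's after-segments against the widths contributed
     *  by the next row's before-borders, or by the body/table borders
     *  when a break occurs after this row. */
    void finishAfterSegments(int[] contributedWidths) {
        for (int i = 0; i < afterWidths.length; i++) {
            // 'wider border wins' as a stand-in for the real rules
            afterWidths[i] = Math.max(afterWidths[i], contributedWidths[i]);
        }
        afterResolved = true;
    }
}

class StagedBorderResolution {
    /** Step driven by the row iterator between two physical rows. */
    static void betweenRows(Row previous, int[] nextRowBeforeWidths) {
        previous.finishAfterSegments(nextRowBeforeWidths);
    }

    /** Forced step on a break, possibly in the middle of a 'physical' row:
     *  the after-borders are finished against the table/body borders. */
    static void onBreak(Row current, int[] tableOrBodyWidths) {
        current.finishAfterSegments(tableOrBodyWidths);
    }
}
```

The point of the split is that the expensive per-segment decision runs at most twice per segment: once when the row is laid out, once when its successor (or a break) is known.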
Good, glad to have a hand to help. :-)

Quite on the contrary, since the resolving of the borders also happens at row level, which seems to be an attractive place to deal with breaks, as we should have access to all related border segments in one spot. Although I may be missing some very nasty consequences here... :-/

There are a few. I'd appreciate it if you would invest the time to investigate this. The more people know about this, the better.

I'll think it over a bit more first, but IMO, possibly having to decide between 5 or 6 sets of border-specs for the segments of, say, 10 grid units makes matters more complex than:
- rule out 3 or 4 sets once, for all 10 of them
- decide between 2 sets, one GridUnit at a time ...
- decide between 2 sets, or possibly finish up in case of a break, one at a time

One immediate constraint that strikes me is that we would, strictly speaking, have no definite values for the border-widths of the after-border segments of a row's GridUnits after the first pass, since the border-widths for these segments could still be altered by the call to TRIter.resolveBorders() that would be made between the current row and the next row (or break)... Ultimately, we would only have a full idea of the effective settings of a segment after the *last* GridUnit it belongs to has called resolveBorders(), or *after* an effective break has triggered TRIter.resolveBorders().

Yep.

Can we give the border resolution a head start, say, create a 'buffer' of resolved border-specs for up to five rows ahead of the main layout? Hmm... maybe a bit ambitious... Two iterators running synchronously, but the 'heavy' one only starts after the 'light' one has reached five; from that point on, alternate between the two iterators until the first one runs out of rows, then use up the buffer...?

Now, that seems interesting, *if* at all manageable of course. At any rate, you can already look ahead with the TableRowIterator as much as you like.
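The two-iterator buffering idea could look roughly like this. All names are hypothetical (this is not FOP's TableRowIterator); the sketch only illustrates the scheme of a 'light' resolution pass running a fixed number of rows ahead of the 'heavy' layout pass.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical sketch: a light pass resolves borders a few rows
 *  ahead of the heavy layout pass by keeping a small buffer. */
class BufferedRowIterator<R> {
    private final Iterator<R> source;       // the 'light' iterator
    private final Deque<R> buffer = new ArrayDeque<>();
    private final int lookAhead;

    BufferedRowIterator(Iterator<R> source, int lookAhead) {
        this.source = source;
        this.lookAhead = lookAhead;
        fill();                             // give the light pass its head start
    }

    private void fill() {
        while (buffer.size() < lookAhead && source.hasNext()) {
            R row = source.next();
            resolveBorders(row);            // preliminary border resolution
            buffer.addLast(row);
        }
    }

    /** Placeholder for the per-row resolution step. */
    private void resolveBorders(R row) { /* ... */ }

    boolean hasNext() {
        return !buffer.isEmpty();
    }

    /** The 'heavy' pass consumes a row whose borders are already
     *  resolved, while the light pass tops the buffer back up; once
     *  the source runs out, the remaining buffer is used up. */
    R next() {
        R row = buffer.removeFirst();
        fill();
        return row;
    }
}
```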
What it's currently missing is a GC mechanism for the rows that are not needed anymore. That means we still have potential problems on long tables.

snip/ When the first row is finished, we would have two unfinished segments of after-borders. Make these available to the next row as preliminary before-border segments, which it can then resolve with its own. Next, we already know that this row contains three cells, so we need some sort of distribution method here --i.e. calculations based on relevant GridUnit IPD?-- to give an indication as to which segment(s) need to be split into how many parts.

Now you lost me. I see a border segment as identical to a GridUnit's border (not a cell's border). That's the impression I got from the spec. Are you talking about handling column spanning here?

Sorry, I must have been getting a bit sleepy. Picture the same table upside-down to begin with... But indeed, it's much better to just have one segment per
Re: [VOTE] Merge Knuth branch back into HEAD
On 13.05.2005 18:01:44 Andreas L. Delmelle wrote:

snip/ if you would like to take a stab at the collapsed border resolution, then please do. I'll leave it aside for the moment and will concentrate on implementing or fixing the rest of the important features for table layout (BPD/height props, breaks, keeps, etc.).

Will certainly do so. So far, most of my time has still been spent on catching up with you guys... Hopefully there will be no more unforeseen circumstances that keep me away for a few months, so I can finally get some really constructive work done on those 'ideas' of mine...

Wonderful! If I can help in any way, just yell. And don't forget to remind me if I don't write that nasty example I promised. :-)

snip/ I hope I wasn't disrespectful by snipping out and not replying to parts of your post.

Well, I wouldn't worry too much about that. I'm rather thick-skinned, if you know what I mean... And if my initial reply to the vote came across as disrespectful to you --since you obviously have invested a great deal of your time into that algorithm, and I made it seem like it wasn't worth much-- my apologies, of course!

There was absolutely no problem there.

Jeremias Maerki
Re: [VOTE] Merge Knuth branch back into HEAD
On 11.05.2005 00:52:21 Andreas L. Delmelle wrote:

snip/ Jeremias, what do you mean with complexity in certain areas? Tables only, or are there other complexities that you perceived as overwhelming?

No, it's mainly the complexity of the collapsed border model ...

Yes, I've been thinking and reading up on that stuff, and somehow it seems a bit --a tiny bit-- simpler if you try to figure out 'collapse-with-precedence' first, since you have to decide on a purely numerical basis, so it may facilitate translation into an algorithm. The 'Eye Catching' question could then be solved as a scenario with fixed precedence values for the different styles, plus a factor for the widths, etc.

Hmm, I think you got the wrong impression. It's not that I'm having problems with the border resolution. This actually works fine by now, even if it might need some additional tweaking for calculating new constellations in break conditions. The design of the resolution is already prepared to easily handle the precedence variant. It's just a matter of creating an additional subclass (of CollapsingBorderModel). The data sources for the decisions are there. The real problem lies within the effects that borders have on the generated combined element list after they have been resolved. I'm sorry for not making that clear enough. Still (read on below):

Still, after a look at the code and the Wiki, I had the impression that this path hadn't yet been taken into consideration, so hopefully this offers some relief...

Hmm, I actually left that away simply because I thought it would be quite simple. I could be wrong though.
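The 'decide on a purely numerical basis' idea can be sketched as a winner function over precedence values. This is a deliberate simplification with invented names, not FOP's CollapsingBorderModel and not the complete XSL-FO precedence rules; it only shows why the precedence variant reduces to a numeric comparison.

```java
/** Hypothetical sketch of 'collapse-with-precedence': the highest
 *  precedence value wins outright, and only a tie falls back to a
 *  further rule (here simply 'the wider border wins'). */
class BorderCandidate {
    final int precedence;   // value of the border-*-precedence property
    final int width;        // border width in millipoints

    BorderCandidate(int precedence, int width) {
        this.precedence = precedence;
        this.width = width;
    }
}

class PrecedenceModel {
    /** Purely numerical decision: compare precedence, then width. */
    static BorderCandidate determineWinner(BorderCandidate a, BorderCandidate b) {
        if (a.precedence != b.precedence) {
            return a.precedence > b.precedence ? a : b;
        }
        return a.width >= b.width ? a : b;
    }
}
```

The 'Eye Catching' variant mentioned above would then amount to feeding this same function fixed precedence values derived from the border styles and widths.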
Starting with the simplest case, a rough description: p(table) > p(body) > p(row) > p(column) > p(cell) means:

- table-border for:
  - border-start of the first GridUnit in a Row
  - border-end of the last GridUnit in a Row
  - border-before of all GridUnits in the first Row of a (sub)page
  - border-after of all GridUnits in the last Row of a (sub)page
- row-border for:
  - border-before of all GridUnits not in the first Row of a (sub)page
  - border-after of all GridUnits not in the last Row of a (sub)page
- column-border for:
  - border-start of all GridUnits except when first in a Row
  - border-end of all GridUnits except when last in a Row
- body-borders and cell-borders are overruled

I probably don't get what you're targeting at, but one thing disturbs me here: you may not have a Row instance.

Mind the Capitals, and what I have already mentioned in a previous post --about doing part of the resolving at row level-- begins to make a bit more sense now. When the BodyLM is initialized, you can already decide between 'table' and 'body' borders (for non-break conditions) and pass that result to the RowLM,

I don't use the RowLM anymore. There's only the TableLM, the TableContentLM and the CellLM. I know I should have removed the obsolete LMs by now. I simply was too deep in the mud to notice. The next best place where the functionality of the RowLM lies is the TableRowIterator. You'd probably pass this one the result.

that passes that result OR its own border-specs to its GridUnits, and the GridUnits ultimately only have to decide between the relevant 'row'-borders, 'column'-borders and their own... I think one would have a hard time getting closer to the meaning of 'collapsing' than this approach.

I can see the potential benefit of not having to take all the influencing border sources into account, but precalculating some borders and thus optimizing the code a bit. The beauty of the current approach IMO lies in the concentration of the calculation in one spot.
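The rule-out step in the list above could be expressed as a mapping from a grid unit's position to the candidate border source for each side. A hypothetical sketch with invented names (not FOP's API), ignoring the overruled body and cell borders as the list suggests:

```java
/** Hypothetical sketch: from the position alone, the border source
 *  that can win each segment is already narrowed down, before any
 *  per-unit resolution against the cell's own border happens. */
enum Source { TABLE, ROW, COLUMN }

class GridPosition {
    final boolean firstInRow, lastInRow, firstRowOnPage, lastRowOnPage;

    GridPosition(boolean firstInRow, boolean lastInRow,
                 boolean firstRowOnPage, boolean lastRowOnPage) {
        this.firstInRow = firstInRow;
        this.lastInRow = lastInRow;
        this.firstRowOnPage = firstRowOnPage;
        this.lastRowOnPage = lastRowOnPage;
    }

    // table-border on the outer edges of the (sub)page,
    // row-border between rows, column-border between columns
    Source beforeSource() { return firstRowOnPage ? Source.TABLE : Source.ROW; }
    Source afterSource()  { return lastRowOnPage  ? Source.TABLE : Source.ROW; }
    Source startSource()  { return firstInRow     ? Source.TABLE : Source.COLUMN; }
    Source endSource()    { return lastInRow      ? Source.TABLE : Source.COLUMN; }
}
```

Each GridUnit would then only resolve the pre-selected source against its own cell border, instead of considering all five sources for every segment.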
I think your approach would make the border resolution more decentralized and therefore harder to track down in the already complex maze.

What seemed a bit awkward while I was browsing through the relevant code was the constant need to pass the 'side' of the GridUnit around when resolving the border :-/ Still, that seems more like a consequence of delaying the entire border-resolving process until the level of the GridUnit is reached.

Constant need? There are four calls to GridUnit.resolveBorder() in the code, one for each side. There will be a couple of additional ones once we have figured out how to resolve (or better, store) the borders for the break conditions. resolveBorder() calls go straight into determineWinner() calls on the CollapsingBorderModel. It's not that awkward, is it?

Also, I was juggling with the idea of creating a BorderSegment object that operates in conjunction with the GridUnit, but 'in between and over' Rows, as it were... Instead of having a GridUnit 'resolve its own borders', the BorderSegments 'resolve themselves' at the appropriate time. In essence, those segments need to know nothing about 'before' or 'after', 'start' or 'end'; they just pick the right border spec from the given set. What gave me this idea was Simon's example, where you need information
Re: [VOTE] Merge Knuth branch back into HEAD
Jeremias Maerki wrote:

I'm not where I would like to be yet (with table layout). Overall, there are still a number of problems to be solved. These are (a potentially incomplete list):

- Table layout including header, footer, spans and borders (*)
- Markers
- before-floats and footnotes
- keeps and breaks on tables
- strength values for keeps
- the other known table-related problems as documented on the Wiki
- change of available IPD and BPD between pages
- last-page
- column-spanning and column balancing

I just tried running a sample FO that contained markers and got a nasty error. Are they broken due to the changes for Knuth page breaking? Do you anticipate any pain in fixing them?

snip/ My vote: At this point I'm only able to give a +0.95, where the missing 0.05 is due to the fact that the Knuth approach has given me headache after headache. There are pros and cons to the whole approach. I still cannot fully exclude the possibility that we're going to hit a dead end. And I'm still not comfortable with the complexity in certain areas, although you could probably say that it would be similarly complex with the old approach. Anyway, I've gotten used to thinking in terms of boxes, glue and penalties. Were it not for tables, my vote would have been clearer.

I understand why you are not 100% sure on this vote. However, I still believe we are making progress. I'm not convinced the Knuth approach leads to a dead end. So here's my +1.

I understand people's concerns about performance. I fully expect it to be slow once we get it working. I believe we should start looking for optimizations and time-saving ideas once we have a solution that is working for most scenarios. If we try to make optimizations now, then they will be undone once we implement the missing features.

Chris
[VOTE] Merge Knuth branch back into HEAD
I'm not where I would like to be yet (with table layout). Overall, there are still a number of problems to be solved. These are (a potentially incomplete list):

- Table layout including header, footer, spans and borders (*)
- Markers
- before-floats and footnotes
- keeps and breaks on tables
- strength values for keeps
- the other known table-related problems as documented on the Wiki
- change of available IPD and BPD between pages
- last-page
- column-spanning and column balancing

(*) ATM I've got the basic algorithm, but I'm stuck with the many details that arise from the collapsing border model. I'm going to back off from this for now and instead I'm going to try to at least make the separate border model work. This model doesn't have these nasty interactions between cells that keep my head spinning. Painting this stuff on paper is hard enough; implementing it is even harder.

Still, we're at a point where we should finally say yes or no to further pursuing the new page breaking approach. Merging the branch back into HEAD means a step back for a few features and, on the other side, a step forward especially for keeps. I got the impression that the team is pretty much committed to continuing on this path, and this vote should confirm that.

My vote: At this point I'm only able to give a +0.95, where the missing 0.05 is due to the fact that the Knuth approach has given me headache after headache. There are pros and cons to the whole approach. I still cannot fully exclude the possibility that we're going to hit a dead end. And I'm still not comfortable with the complexity in certain areas, although you could probably say that it would be similarly complex with the old approach. Anyway, I've gotten used to thinking in terms of boxes, glue and penalties. Were it not for tables, my vote would have been clearer.

Jeremias Maerki
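For readers unfamiliar with the 'boxes, glue and penalties' vocabulary mentioned in the vote: these are the three element types of the Knuth line/page-breaking model that the branch adopts. A minimal, illustrative sketch (not necessarily FOP's actual classes or signatures):

```java
/** Minimal sketch of the Knuth element vocabulary: a box occupies
 *  fixed space, glue can stretch or shrink, and a penalty marks a
 *  potential break point with an associated cost. */
abstract class KnuthElement {
    final int width;                  // dimension in millipoints

    KnuthElement(int width) {
        this.width = width;
    }
}

class KnuthBox extends KnuthElement {
    KnuthBox(int width) { super(width); }
}

class KnuthGlue extends KnuthElement {
    final int stretch, shrink;        // how much the space may vary

    KnuthGlue(int width, int stretch, int shrink) {
        super(width);
        this.stretch = stretch;
        this.shrink = shrink;
    }
}

class KnuthPenalty extends KnuthElement {
    static final int INFINITE = 1000; // +INFINITE forbids a break here
    final int penalty;

    KnuthPenalty(int width, int penalty) {
        super(width);
        this.penalty = penalty;
    }

    /** A penalty of -INFINITE (or lower) forces a break. */
    boolean isForcedBreak() {
        return penalty <= -INFINITE;
    }
}
```

The layout managers produce sequences of these elements, and the breaking algorithm then chooses the set of break points with the lowest total demerits; the table discussion above is about how resolved borders distort those generated sequences.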
Re: [VOTE] Merge Knuth branch back into HEAD
On 10.05.2005 20:41:19 Simon Pepping wrote:

My worry with the new approach is performance: we know that the algorithms require quite some computational steps, but we have no idea whether, in the end, performance on a large document will be acceptable or not. (Perhaps Luca has some experimental evidence from his own implementation?)

I still have some performance comparisons on my todo list as preparation for the ApacheCon session. I can run the examples through the new code to get an idea. That's a no-brainer with my API wrapper. I'll keep you posted.

Jeremias, what do you mean with complexity in certain areas? Tables only, or are there other complexities that you perceived as overwhelming?

No, it's mainly the complexity of the collapsed border model, plus the implications from row spanning and, if you go further, handling the min/opt/max stuff, which I dared to simply ignore. There are so many possible interactions. Take the RowBorder2 example. It took me a whole day to work through on paper. And it's still not covering all the possible cases. If you remove the column span in the header and do some nasty stuff with the border widths, you can create really mean examples. I intend to write one when I'm in a better mood.

Jeremias Maerki
RE: [VOTE] Merge Knuth branch back into HEAD
-Original Message-
From: Jeremias Maerki [mailto:[EMAIL PROTECTED]

On 10.05.2005 20:41:19 Simon Pepping wrote:

Hi guys,

For starters: my vote is +1. I agree with Simon, and also very much feel like we're on the right track with this. Sure, it will *still* take some work...

snip/ Jeremias, what do you mean with complexity in certain areas? Tables only, or are there other complexities that you perceived as overwhelming?

No, it's mainly the complexity of the collapsed border model ...

Yes, I've been thinking and reading up on that stuff, and somehow it seems a bit --a tiny bit-- simpler if you try to figure out 'collapse-with-precedence' first, since you have to decide on a purely numerical basis, so it may facilitate translation into an algorithm. The 'Eye Catching' question could then be solved as a scenario with fixed precedence values for the different styles, plus a factor for the widths, etc. Still, after a look at the code and the Wiki, I had the impression that this path hadn't yet been taken into consideration, so hopefully this offers some relief...

Starting with the simplest case, a rough description: p(table) > p(body) > p(row) > p(column) > p(cell) means:

- table-border for:
  - border-start of the first GridUnit in a Row
  - border-end of the last GridUnit in a Row
  - border-before of all GridUnits in the first Row of a (sub)page
  - border-after of all GridUnits in the last Row of a (sub)page
- row-border for:
  - border-before of all GridUnits not in the first Row of a (sub)page
  - border-after of all GridUnits not in the last Row of a (sub)page
- column-border for:
  - border-start of all GridUnits except when first in a Row
  - border-end of all GridUnits except when last in a Row
- body-borders and cell-borders are overruled

Mind the Capitals, and what I have already mentioned in a previous post --about doing part of the resolving at row level-- begins to make a bit more sense now.
When the BodyLM is initialized, you can already decide between 'table' and 'body' borders and pass that result to the RowLM, that passes that result OR its own border-specs to its GridUnits, and the GridUnits ultimately only have to decide between the relevant 'row'-borders, 'column'-borders and their own... I think one would have a hard time getting closer to the meaning of 'collapsing' than this approach.

What seemed a bit awkward while I was browsing through the relevant code was the constant need to pass the 'side' of the GridUnit around when resolving the border :-/ Still, that seems more like a consequence of delaying the entire border-resolving process until the level of the GridUnit is reached.

Also, I was juggling with the idea of creating a BorderSegment object that operates in conjunction with the GridUnit, but 'in between and over' Rows, as it were... Instead of having a GridUnit 'resolve its own borders', the BorderSegments 'resolve themselves' at the appropriate time. In essence, those segments need to know nothing about 'before' or 'after', 'start' or 'end'; they just pick the right border spec from the given set.

What gave me this idea was Simon's example, where you need information about the GridUnits for the full two rows --to know how many segments there are, how they are distributed, and which sets of border-specs are relevant for each of the segments. When the first row is finished, we would have two unfinished segments of after-borders. Make these available to the next row as preliminary before-border segments, which it can then resolve with its own. Next, we already know that this row contains three cells, so we need some sort of distribution method here --i.e. calculations based on relevant GridUnit IPD?-- to give an indication as to which segment(s) need to be split into how many parts.

Then again, it seems only *really* necessary for before- and after-borders.
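The self-resolving BorderSegment idea could be sketched like this. All names are hypothetical (not FOP's API), border specs are reduced to plain widths, and 'the wider border wins' stands in for the real collapsing rules; the point is only that the segment resolves itself once all expected contributors have spoken, without knowing which side it lies on.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a segment knows nothing about 'before' or
 *  'after'; the GridUnits (and row/body/table borders) around it
 *  contribute their specs, and when the expected number of
 *  contributions has arrived, the segment resolves itself. */
class BorderSegment {
    private final int contributorsExpected;
    private final List<Integer> widths = new ArrayList<>();
    private Integer resolvedWidth;          // null until resolution happens

    BorderSegment(int contributorsExpected) {
        this.contributorsExpected = contributorsExpected;
    }

    /** One contributor (e.g. a GridUnit on either side) adds its spec. */
    void contribute(int width) {
        widths.add(width);
        if (widths.size() == contributorsExpected) {
            // 'wider border wins' as a stand-in for the full rules
            resolvedWidth = widths.stream().max(Integer::compare).get();
        }
    }

    boolean isResolved() {
        return resolvedWidth != null;
    }

    int resolvedWidth() {
        return resolvedWidth;
    }
}
```

This mirrors the constraint discussed earlier in the thread: a row's after-border segment stays unresolved until the last GridUnit it belongs to (or a break) has contributed.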
The border-specs for the vertical border segments could be made available to a GridUnit through the Column (? via the Row's column list: end-border of the previous GridUnit = the resolved start-border of the current GridUnit's Column -- or am I thinking too linearly, too LRTB, maybe?)

In theory --here I go again...-- it would then be the BorderSegments that need information on the border specs of Table/Body/Row/(Column?)/Cell for at most two cells at the same time. I don't know if, in practice, this idea would save much compared to what you currently have... but it somehow seems attractive, especially in combination with the approach of resolving in different stages.

Hope this helps! :-)

Cheers,

Andreas
Re: [VOTE] Merge Knuth branch back into HEAD
Jeremias Maerki wrote: Still, we're at a point where we should finally say yes or no to further pursuing the new page breaking approach. Merging the branch back into HEAD means a step back for a few features and on the other side a step forward especially for keeps. I got the impression that the team is pretty much committed to continue on this path and this vote should confirm that. The team has made remarkable progress in this. My congratulations. From the outside, I share the reservations expressed by Jeremias and Simon. It will be an extremely impressive achievement if they are all resolved. Peter -- Peter B. West http://cv.pbw.id.au/ Folio http://defoe.sourceforge.net/folio/ http://folio.bkbits.net/