Re: Hacking a temporary solution for Changing IPD
On 19 Jun 2009, at 15:58, Andreas Delmelle wrote: Just for the sake of completeness: On 19 Jun 2009, at 15:46, Vincent Hennebert wrote: Until page breaking has been done you don’t know where the paragraph will be broken. So you don’t know how far in the element generation you must go. Not true... Actually, the first part /is/ true, but the conclusion is wrong. What I actually meant was that we cannot let ourselves be tempted to think that we need to effectively compute the first break to know / whether/ a paragraph will be broken (or: whether there will be at least one break). Regards Andreas
Re: Hacking a temporary solution for Changing IPD
On 19 Jun 2009, at 15:46, Vincent Hennebert wrote: Until page breaking has been done you don’t know where the paragraph will be broken. So you don’t know how far in the element generation you must go. Not true. Until page breaking has been done, we don't know /exactly/ where the paragraph will be broken, but that is not necessary. At the point where I would see the intervention, all the line-breaks for the paragraph have been computed, and the elements simply represent the line-boxes (+ possibly, spaces in between). If we calculate the total content-height for those line-boxes, include a pessimistic estimate for the line-spacing, and compare that to the available height, we know that there will very, very likely have to be a break. More is not really necessary. Regards Andreas
Re: Hacking a temporary solution for Changing IPD
Hi, Andreas Delmelle wrote: > On 18 Jun 2009, at 20:26, Simon Pepping wrote: > > Hi Simon, Vincent, > >> >> I get the feeling that this means that you are planning to relaunch >> line breaking while in the page breaking phase. Yes. >> Is that possible? That’s what I am currently trying to figure out. Getting the index of the last element on the previous page is not too difficult, restarting element generation and resetting all applicable variables is harder. > What I was wondering too, and why I somehow believe that suspending > line-breaking and starting the page-breaking phase earlier is likely to > be 'easier' to implement (even though the required changes would impact > a lot more classes). It would basically come down to extending the > approach that is currently taken when dealing with span-changes or > forced page- or column-breaks: exit the line-breaking loop, start > page-breaking for the element-list up to that point, and return later to > continue with the line-breaking. Currently, this approach always uses > separate PageBreakingAlgorithm instances, but I don't think it would be > very hard to re-use the same PBA and append the new elements to its > 'current paragraph'. Until page breaking has been done you don’t know where the paragraph will be broken. So you don’t know how far in the element generation you must go. >> It is a bit strange that changing IPD would waste CPU cycles, as it >> could be achieved by the simpler of two strategies, best-fit versus >> total-fit. Theoretically speaking it shouldn’t, in practice the code is not designed to switch back and forth from line generation to page generation. Throwing the elements away and re-creating them with the new page width in mind appears to be simpler to achieve, and should be good enough for this short-term solution. >> But maybe implementing best-fit next to total-fit would be >> more than a simple solution. >> >> I agree that it would be an improvement to have even a partial >> solution for the changing IPD feature. Good luck with this necessary >> job. Thanks :-) Vincent
Re: Hacking a temporary solution for Changing IPD
On 18 Jun 2009, at 20:26, Simon Pepping wrote: Hi Simon, Vincent, I get the feeling that this means that you are planning to relaunch line breaking while in the page breaking phase. Is that possible? What I was wondering too, and why I somehow believe that suspending line-breaking and starting the page-breaking phase earlier is likely to be 'easier' to implement (even though the required changes would impact a lot more classes). It would basically come down to extending the approach that is currently taken when dealing with span-changes or forced page- or column-breaks: exit the line-breaking loop, start page- breaking for the element-list up to that point, and return later to continue with the line-breaking. Currently, this approach always uses separate PageBreakingAlgorithm instances, but I don't think it would be very hard to re-use the same PBA and append the new elements to its 'current paragraph'. Regards Andreas
Re: Hacking a temporary solution for Changing IPD
Just back online. On Thu, Jun 11, 2009 at 04:57:27PM +0200, Andreas Delmelle wrote: > On 11 Jun 2009, at 12:40, Vincent Hennebert wrote: > > Hi Vincent > > >> I spent some time looking at the current code and it seems to me that >> a hack could be implemented without too many difficulties. It >> basically >> consists in 2 steps: >> 1. in the Knuth breaking algorithm, when creating a new active node, >> look whether the IPD for the following page is the same or not. If >> not, deactivate the node. Once we run out of active nodes, select >> the >> best of those deactivated nodes and treat it as if it were the >> regular final node. Add areas for content up to that node. >> >> 2. re-create Knuth elements, starting from the index corresponding to >> that node. Re-launch the breaking algorithm, starting from there. >> Then back to step 1, until the end of the document is reached. >> >> Step 1 should be doable without turning everything upside down. > >> Step 2 implies to change the signature of the >> LayoutManager.getNextKnuthElements method, adding a parameter that >> corresponds to the index from where to start the generation of Knuth >> elements. We could largely ignore it, except in BlockLayoutManager >> where >> we would re-launch the line-breaking algorithm, taking the new IPD >> into >> account. > > If I interpret correctly, we would (supposing a page-sequence without > forced breaks and/or span changes): > > a) generate the complete block list (effectively computing the line- > breaks for the whole page-sequence) > b) when computing the page-breaks, and encountering a new page with > different available IPD, re-generate the remaining elements and > recompute the line-breaks after that position > > b) would occur as many times as we have IPD changes. I get the feeling that this means that you are planning to relaunch line breaking while in the page breaking phase. Is that possible? It is a bit strange that changing IPD would waste CPU cycles, as it could be achieved by the simpler of two strategies, best-fit versus total-fit. But maybe implementing best-fit next to total-fit would be more than a simple solution. I agree that it would be an improvement to have even a partial solution for the changing IPD feature. Good luck with this necessary job. Simon -- Simon Pepping home page: http://www.leverkruid.eu
Re: Hacking a temporary solution for Changing IPD
On 12 Jun 2009, at 13:23, Vincent Hennebert wrote: Hi Vincent [Me:] I'm wondering whether it would not be equally feasible to have the FlowLM check the total height of the block-boxes up to a point (estimated). Do this after every call to childLM.getNextKnuthElements(). Then, if the total height exceeds the BPD for the current page, call back to check if the next page has the same IPD (or the same amount of columns). If so, then we happily continue, just as we do now. If not, then we hand the list off to the PageBreaker, run it through a PageBreakingAlgorithm to compute the page-breaks up to that point, add the areas, and resume later, passing the LineBreakingAlgorithm an updated LayoutContext corresponding to the next page. While this is a good idea, it will imply more disruptive changes to the codebase. I’d like to keep the necessary changes to a strict minimum. Plus I’m not even sure yet that my idea will not lead to a dead end. Your idea is not incompatible with mine, however. It’s an additional step performing some optimization. We can always re-consider it once I have some working code. OK, I already played with it a bit last night, just to see whether I was not proposing the impossible, and it seems to be feasible. As you note, it will require some more invasive changes across more classes than simply the BreakingAlgorithm. I'll set it aside for the moment, but will surely pick it back up once you have developed your approach a bit further. Later Andreas
Re: Hacking a temporary solution for Changing IPD
Hi Andreas, Andreas Delmelle wrote: > On 11 Jun 2009, at 12:40, Vincent Hennebert wrote: > > Hi Vincent > > >> I spent some time looking at the current code and it seems to me that >> a hack could be implemented without too many difficulties. It basically >> consists in 2 steps: >> 1. in the Knuth breaking algorithm, when creating a new active node, >> look whether the IPD for the following page is the same or not. If >> not, deactivate the node. Once we run out of active nodes, select the >> best of those deactivated nodes and treat it as if it were the >> regular final node. Add areas for content up to that node. >> >> 2. re-create Knuth elements, starting from the index corresponding to >> that node. Re-launch the breaking algorithm, starting from there. >> Then back to step 1, until the end of the document is reached. >> >> Step 1 should be doable without turning everything upside down. > >> Step 2 implies to change the signature of the >> LayoutManager.getNextKnuthElements method, adding a parameter that >> corresponds to the index from where to start the generation of Knuth >> elements. We could largely ignore it, except in BlockLayoutManager where >> we would re-launch the line-breaking algorithm, taking the new IPD into >> account. >> >> Obviously this is a limited approach. There is likely to be >> a (potentially huge) waste of CPU cycles due to the re-creation of Knuth >> elements. There may be side effects that I’ve missed so far. But I think >> it’s worth giving it a try. > > The only thing I'm slightly concerned about, is the case where there > would be multiple IPD changes for subsequent pages. I'm assuming that is > the waste of CPU cycles you're referring to (?) That’s right. > If I interpret correctly, we would (supposing a page-sequence without > forced breaks and/or span changes): > > a) generate the complete block list (effectively computing the > line-breaks for the whole page-sequence) > b) when computing the page-breaks, and encountering a new page with > different available IPD, re-generate the remaining elements and > recompute the line-breaks after that position > > b) would occur as many times as we have IPD changes. > > I'm wondering whether it would not be equally feasible to have the > FlowLM check the total height of the block-boxes up to a point > (estimated). Do this after every call to childLM.getNextKnuthElements(). > Then, if the total height exceeds the BPD for the current page, call > back to check if the next page has the same IPD (or the same amount of > columns). If so, then we happily continue, just as we do now. If not, > then we hand the list off to the PageBreaker, run it through a > PageBreakingAlgorithm to compute the page-breaks up to that point, add > the areas, and resume later, passing the LineBreakingAlgorithm an > updated LayoutContext corresponding to the next page. While this is a good idea, it will imply more disruptive changes to the codebase. I’d like to keep the necessary changes to a strict minimum. Plus I’m not even sure yet that my idea will not lead to a dead end. Your idea is not incompatible with mine, however. It’s an additional step performing some optimization. We can always re-consider it once I have some working code. > If this functionality could somehow be factored in to BlockStackingLM, > all the better, since we're definitely going to need it to avoid > BlockLMs and TableLMs from accumulating the element-lists for all their > content. They too will need to be able to stop when their accumulated > content-height exceeds the threshold (available BPD + a percentage?) I’m not planning to do any change to lists or tables. Tables in particular may create all sorts of problems with collapsed borders and repeated headers. This will be leading us too far for what is just a temporary hack. > Step 2) would still be necessary, since we need to know at what point to > resume the line-breaking later on. > > The benefit being that we would catch the IPD-change long before the end > of the page-sequence is reached by the line-breaking loop. The amount of > elements that need to be re-generated would always remain as small as > possible. > > If I judge correctly, and it is feasible, then this may present an > opportunity for a pure FO hack to reduce the memory consumption for > arbitrarily sized page-sequences: use alternating page-masters with a > different IPD. Only make the difference practically invisible to naked > eye. FOP would still detect it, and flush the list up to that point. > > Thoughts, you asked? ;-) Thanks! > Andreas Delmelle Vincent
Re: Hacking a temporary solution for Changing IPD
On 11 Jun 2009, at 12:40, Vincent Hennebert wrote: Hi Vincent I spent some time looking at the current code and it seems to me that a hack could be implemented without too many difficulties. It basically consists in 2 steps: 1. in the Knuth breaking algorithm, when creating a new active node, look whether the IPD for the following page is the same or not. If not, deactivate the node. Once we run out of active nodes, select the best of those deactivated nodes and treat it as if it were the regular final node. Add areas for content up to that node. 2. re-create Knuth elements, starting from the index corresponding to that node. Re-launch the breaking algorithm, starting from there. Then back to step 1, until the end of the document is reached. Step 1 should be doable without turning everything upside down. Step 2 implies to change the signature of the LayoutManager.getNextKnuthElements method, adding a parameter that corresponds to the index from where to start the generation of Knuth elements. We could largely ignore it, except in BlockLayoutManager where we would re-launch the line-breaking algorithm, taking the new IPD into account. Obviously this is a limited approach. There is likely to be a (potentially huge) waste of CPU cycles due to the re-creation of Knuth elements. There may be side effects that I’ve missed so far. But I think it’s worth giving it a try. The only thing I'm slightly concerned about, is the case where there would be multiple IPD changes for subsequent pages. I'm assuming that is the waste of CPU cycles you're referring to (?) If I interpret correctly, we would (supposing a page-sequence without forced breaks and/or span changes): a) generate the complete block list (effectively computing the line- breaks for the whole page-sequence) b) when computing the page-breaks, and encountering a new page with different available IPD, re-generate the remaining elements and recompute the line-breaks after that position b) would occur as many times as we have IPD changes. I'm wondering whether it would not be equally feasible to have the FlowLM check the total height of the block-boxes up to a point (estimated). Do this after every call to childLM.getNextKnuthElements(). Then, if the total height exceeds the BPD for the current page, call back to check if the next page has the same IPD (or the same amount of columns). If so, then we happily continue, just as we do now. If not, then we hand the list off to the PageBreaker, run it through a PageBreakingAlgorithm to compute the page-breaks up to that point, add the areas, and resume later, passing the LineBreakingAlgorithm an updated LayoutContext corresponding to the next page. If this functionality could somehow be factored in to BlockStackingLM, all the better, since we're definitely going to need it to avoid BlockLMs and TableLMs from accumulating the element-lists for all their content. They too will need to be able to stop when their accumulated content-height exceeds the threshold (available BPD + a percentage?) Step 2) would still be necessary, since we need to know at what point to resume the line-breaking later on. The benefit being that we would catch the IPD-change long before the end of the page-sequence is reached by the line-breaking loop. The amount of elements that need to be re-generated would always remain as small as possible. If I judge correctly, and it is feasible, then this may present an opportunity for a pure FO hack to reduce the memory consumption for arbitrarily sized page-sequences: use alternating page-masters with a different IPD. Only make the difference practically invisible to naked eye. FOP would still detect it, and flush the list up to that point. Thoughts, you asked? ;-) Andreas Delmelle
Re: Hacking a temporary solution for Changing IPD
Hi Chris, Chris Bowditch wrote: > Vincent Hennebert wrote: > >> Hi All, > > Hi Vincent, > >> Obviously this is a limited approach. There is likely to be >> a (potentially huge) waste of CPU cycles due to the re-creation of Knuth >> elements. There may be side effects that I’ve missed so far. But I think >> it’s worth giving it a try. > > Will the increase in CPU cycles only occur if there is a change in IPD, > or will this affect performance for existing users with the same width > on every page? No, the impact on performance will occur only when the IPD changes. Performance should remain the same for documents that are already supported by the current code. >> What I’m planning to do is create a branch in which I would >> experiment >> that idea. If it turns out to be feasible then we could merge it back to >> Trunk, advertise the ‘fix’ and document its limitations. And hopefully >> that will relieve some pressure on the implementation of the more >> complete new approach. > > Thanks, > > Chris Vincent
Re: Hacking a temporary solution for Changing IPD
Vincent Hennebert wrote: Hi All, Hi Vincent, While I am making rather good progress on my prototype implementation of a new layout approach, I am not likely to get any practical results before some time (a year or more). Meanwhile, users keep bumping into this Changing IPD issue, which is becoming more and more problematic. It would be good if we had some partial, temporary solution with which users could live until the new approach has been implemented. In many cases, the ‘Changing IPD’ problem boils down to the first page of the sequence having some sort of side region encroaching upon the main content, and all the following pages having the same width. And the content is usually made of just plain text (fo:block elements). I have also seen use cases along those lines. I spent some time looking at the current code and it seems to me that a hack could be implemented without too many difficulties. It basically consists in 2 steps: 1. in the Knuth breaking algorithm, when creating a new active node, look whether the IPD for the following page is the same or not. If not, deactivate the node. Once we run out of active nodes, select the best of those deactivated nodes and treat it as if it were the regular final node. Add areas for content up to that node. 2. re-create Knuth elements, starting from the index corresponding to that node. Re-launch the breaking algorithm, starting from there. Then back to step 1, until the end of the document is reached. Step 1 should be doable without turning everything upside down. Step 2 implies to change the signature of the LayoutManager.getNextKnuthElements method, adding a parameter that corresponds to the index from where to start the generation of Knuth elements. We could largely ignore it, except in BlockLayoutManager where we would re-launch the line-breaking algorithm, taking the new IPD into account. I can't offer much technical help with the above as I'm not familar with those areas of the code. Obviously this is a limited approach. There is likely to be a (potentially huge) waste of CPU cycles due to the re-creation of Knuth elements. There may be side effects that I’ve missed so far. But I think it’s worth giving it a try. Will the increase in CPU cycles only occur if there is a change in IPD, or will this affect performance for existing users with the same width on every page? What I’m planning to do is create a branch in which I would experiment that idea. If it turns out to be feasible then we could merge it back to Trunk, advertise the ‘fix’ and document its limitations. And hopefully that will relieve some pressure on the implementation of the more complete new approach. Thanks, Chris
Hacking a temporary solution for Changing IPD
Hi All, While I am making rather good progress on my prototype implementation of a new layout approach, I am not likely to get any practical results before some time (a year or more). Meanwhile, users keep bumping into this Changing IPD issue, which is becoming more and more problematic. It would be good if we had some partial, temporary solution with which users could live until the new approach has been implemented. In many cases, the ‘Changing IPD’ problem boils down to the first page of the sequence having some sort of side region encroaching upon the main content, and all the following pages having the same width. And the content is usually made of just plain text (fo:block elements). I spent some time looking at the current code and it seems to me that a hack could be implemented without too many difficulties. It basically consists in 2 steps: 1. in the Knuth breaking algorithm, when creating a new active node, look whether the IPD for the following page is the same or not. If not, deactivate the node. Once we run out of active nodes, select the best of those deactivated nodes and treat it as if it were the regular final node. Add areas for content up to that node. 2. re-create Knuth elements, starting from the index corresponding to that node. Re-launch the breaking algorithm, starting from there. Then back to step 1, until the end of the document is reached. Step 1 should be doable without turning everything upside down. Step 2 implies to change the signature of the LayoutManager.getNextKnuthElements method, adding a parameter that corresponds to the index from where to start the generation of Knuth elements. We could largely ignore it, except in BlockLayoutManager where we would re-launch the line-breaking algorithm, taking the new IPD into account. Obviously this is a limited approach. There is likely to be a (potentially huge) waste of CPU cycles due to the re-creation of Knuth elements. There may be side effects that I’ve missed so far. But I think it’s worth giving it a try. What I’m planning to do is create a branch in which I would experiment that idea. If it turns out to be feasible then we could merge it back to Trunk, advertise the ‘fix’ and document its limitations. And hopefully that will relieve some pressure on the implementation of the more complete new approach. Thoughts, anyone? Thanks, Vincent