Re: Hacking a temporary solution for Changing IPD

2009-06-19 Thread Andreas Delmelle

On 19 Jun 2009, at 15:58, Andreas Delmelle wrote:

Just for the sake of completeness:


On 19 Jun 2009, at 15:46, Vincent Hennebert wrote:



Until page breaking has been done you don’t know where the paragraph
will be broken. So you don’t know how far in the element generation  
you

must go.


Not true...


Actually, the first part /is/ true, but the conclusion is wrong.
What I actually meant was that we cannot let ourselves be tempted to  
think that we need to effectively compute the first break to know / 
whether/ a paragraph will be broken (or: whether there will be at  
least one break).


Regards

Andreas

Re: Hacking a temporary solution for Changing IPD

2009-06-19 Thread Andreas Delmelle

On 19 Jun 2009, at 15:46, Vincent Hennebert wrote:



Until page breaking has been done you don’t know where the paragraph
will be broken. So you don’t know how far in the element generation  
you

must go.


Not true. Until page breaking has been done, we don't know /exactly/  
where the paragraph will be broken, but that is not necessary. At the  
point where I would see the intervention, all the line-breaks for the  
paragraph have been computed, and the elements simply represent the  
line-boxes (+ possibly, spaces in between).
If we calculate the total content-height for those line-boxes, include  
a pessimistic estimate for the line-spacing, and compare that to the  
available height, we know that there will very, very likely have to be  
a break. More is not really necessary.



Regards

Andreas

Re: Hacking a temporary solution for Changing IPD

2009-06-19 Thread Vincent Hennebert
Hi,

Andreas Delmelle wrote:
> On 18 Jun 2009, at 20:26, Simon Pepping wrote:
> 
> Hi Simon, Vincent,
> 
>> 
>> I get the feeling that this means that you are planning to relaunch
>> line breaking while in the page breaking phase.

Yes.


>> Is that possible?

That’s what I am currently trying to figure out. Getting the index of
the last element on the previous page is not too difficult, restarting
element generation and resetting all applicable variables is harder.


> What I was wondering too, and why I somehow believe that suspending
> line-breaking and starting the page-breaking phase earlier is likely to
> be 'easier' to implement (even though the required changes would impact
> a lot more classes). It would basically come down to extending the
> approach that is currently taken when dealing with span-changes or
> forced page- or column-breaks: exit the line-breaking loop, start
> page-breaking for the element-list up to that point, and return later to
> continue with the line-breaking. Currently, this approach always uses
> separate PageBreakingAlgorithm instances, but I don't think it would be
> very hard to re-use the same PBA and append the new elements to its
> 'current paragraph'.

Until page breaking has been done you don’t know where the paragraph
will be broken. So you don’t know how far in the element generation you
must go.


>> It is a bit strange that changing IPD would waste CPU cycles, as it
>> could be achieved by the simpler of two strategies, best-fit versus
>> total-fit.

Theoretically speaking it shouldn’t, in practice the code is not
designed to switch back and forth from line generation to page
generation. Throwing the elements away and re-creating them with the new
page width in mind appears to be simpler to achieve, and should be good
enough for this short-term solution.


>> But maybe implementing best-fit next to total-fit would be
>> more than a simple solution.
>> 
>> I agree that it would be an improvement to have even a partial
>> solution for the changing IPD feature. Good luck with this necessary
>> job.

Thanks :-)

Vincent


Re: Hacking a temporary solution for Changing IPD

2009-06-18 Thread Andreas Delmelle

On 18 Jun 2009, at 20:26, Simon Pepping wrote:

Hi Simon, Vincent,



I get the feeling that this means that you are planning to relaunch
line breaking while in the page breaking phase. Is that possible?


What I was wondering too, and why I somehow believe that suspending  
line-breaking and starting the page-breaking phase earlier is likely  
to be 'easier' to implement (even though the required changes would  
impact a lot more classes). It would basically come down to extending  
the approach that is currently taken when dealing with span-changes or  
forced page- or column-breaks: exit the line-breaking loop, start page- 
breaking for the element-list up to that point, and return later to  
continue with the line-breaking. Currently, this approach always uses  
separate PageBreakingAlgorithm instances, but I don't think it would  
be very hard to re-use the same PBA and append the new elements to its  
'current paragraph'.



Regards

Andreas


Re: Hacking a temporary solution for Changing IPD

2009-06-18 Thread Simon Pepping
Just back online.

On Thu, Jun 11, 2009 at 04:57:27PM +0200, Andreas Delmelle wrote:
> On 11 Jun 2009, at 12:40, Vincent Hennebert wrote:
>
> Hi Vincent
>
> 
>> I spent some time looking at the current code and it seems to me that
>> a hack could be implemented without too many difficulties. It  
>> basically
>> consists in 2 steps:
>> 1. in the Knuth breaking algorithm, when creating a new active node,
>>   look whether the IPD for the following page is the same or not. If
>>   not, deactivate the node. Once we run out of active nodes, select  
>> the
>>   best of those deactivated nodes and treat it as if it were the
>>   regular final node. Add areas for content up to that node.
>>
>> 2. re-create Knuth elements, starting from the index corresponding to
>>   that node. Re-launch the breaking algorithm, starting from there.
>>   Then back to step 1, until the end of the document is reached.
>>
>> Step 1 should be doable without turning everything upside down.
>
>> Step 2 implies to change the signature of the
>> LayoutManager.getNextKnuthElements method, adding a parameter that
>> corresponds to the index from where to start the generation of Knuth
>> elements. We could largely ignore it, except in BlockLayoutManager  
>> where
>> we would re-launch the line-breaking algorithm, taking the new IPD  
>> into
>> account.
> 
> If I interpret correctly, we would (supposing a page-sequence without  
> forced breaks and/or span changes):
>
> a) generate the complete block list (effectively computing the line- 
> breaks for the whole page-sequence)
> b) when computing the page-breaks, and encountering a new page with  
> different available IPD, re-generate the remaining elements and  
> recompute the line-breaks after that position
>
> b) would occur as many times as we have IPD changes.

I get the feeling that this means that you are planning to relaunch
line breaking while in the page breaking phase. Is that possible?

It is a bit strange that changing IPD would waste CPU cycles, as it
could be achieved by the simpler of two strategies, best-fit versus
total-fit. But maybe implementing best-fit next to total-fit would be
more than a simple solution.

I agree that it would be an improvement to have even a partial
solution for the changing IPD feature. Good luck with this necessary
job.

Simon

-- 
Simon Pepping
home page: http://www.leverkruid.eu


Re: Hacking a temporary solution for Changing IPD

2009-06-12 Thread Andreas Delmelle

On 12 Jun 2009, at 13:23, Vincent Hennebert wrote:

Hi Vincent

  [Me:]

I'm wondering whether it would not be equally feasible to have the
FlowLM check the total height of the block-boxes up to a point
(estimated). Do this after every call to  
childLM.getNextKnuthElements().

Then, if the total height exceeds the BPD for the current page, call
back to check if the next page has the same IPD (or the same amount  
of

columns). If so, then we happily continue, just as we do now. If not,
then we hand the list off to the PageBreaker, run it through a
PageBreakingAlgorithm to compute the page-breaks up to that point,  
add

the areas, and resume later, passing the LineBreakingAlgorithm an
updated LayoutContext corresponding to the next page.


While this is a good idea, it will imply more disruptive changes to  
the

codebase. I’d like to keep the necessary changes to a strict minimum.
Plus I’m not even sure yet that my idea will not lead to a dead end.

Your idea is not incompatible with mine, however. It’s an additional
step performing some optimization. We can always re-consider it once
I have some working code.


OK, I already played with it a bit last night, just to see whether I  
was not proposing the impossible, and it seems to be feasible. As you  
note, it will require some more invasive changes across more classes  
than simply the BreakingAlgorithm.


I'll set it aside for the moment, but will surely pick it back up once  
you have developed your approach a bit further.



Later

Andreas

Re: Hacking a temporary solution for Changing IPD

2009-06-12 Thread Vincent Hennebert
Hi Andreas,

Andreas Delmelle wrote:
> On 11 Jun 2009, at 12:40, Vincent Hennebert wrote:
> 
> Hi Vincent
> 
> 
>> I spent some time looking at the current code and it seems to me that
>> a hack could be implemented without too many difficulties. It basically
>> consists in 2 steps:
>> 1. in the Knuth breaking algorithm, when creating a new active node,
>>   look whether the IPD for the following page is the same or not. If
>>   not, deactivate the node. Once we run out of active nodes, select the
>>   best of those deactivated nodes and treat it as if it were the
>>   regular final node. Add areas for content up to that node.
>>
>> 2. re-create Knuth elements, starting from the index corresponding to
>>   that node. Re-launch the breaking algorithm, starting from there.
>>   Then back to step 1, until the end of the document is reached.
>>
>> Step 1 should be doable without turning everything upside down.
> 
>> Step 2 implies to change the signature of the
>> LayoutManager.getNextKnuthElements method, adding a parameter that
>> corresponds to the index from where to start the generation of Knuth
>> elements. We could largely ignore it, except in BlockLayoutManager where
>> we would re-launch the line-breaking algorithm, taking the new IPD into
>> account.
>>
>> Obviously this is a limited approach. There is likely to be
>> a (potentially huge) waste of CPU cycles due to the re-creation of Knuth
>> elements. There may be side effects that I’ve missed so far. But I think
>> it’s worth giving it a try.
> 
> The only thing I'm slightly concerned about, is the case where there
> would be multiple IPD changes for subsequent pages. I'm assuming that is
> the waste of CPU cycles you're referring to (?)

That’s right.


> If I interpret correctly, we would (supposing a page-sequence without
> forced breaks and/or span changes):
> 
> a) generate the complete block list (effectively computing the
> line-breaks for the whole page-sequence)
> b) when computing the page-breaks, and encountering a new page with
> different available IPD, re-generate the remaining elements and
> recompute the line-breaks after that position
> 
> b) would occur as many times as we have IPD changes.
> 
> I'm wondering whether it would not be equally feasible to have the
> FlowLM check the total height of the block-boxes up to a point
> (estimated). Do this after every call to childLM.getNextKnuthElements().
> Then, if the total height exceeds the BPD for the current page, call
> back to check if the next page has the same IPD (or the same amount of
> columns). If so, then we happily continue, just as we do now. If not,
> then we hand the list off to the PageBreaker, run it through a
> PageBreakingAlgorithm to compute the page-breaks up to that point, add
> the areas, and resume later, passing the LineBreakingAlgorithm an
> updated LayoutContext corresponding to the next page.

While this is a good idea, it will imply more disruptive changes to the
codebase. I’d like to keep the necessary changes to a strict minimum.
Plus I’m not even sure yet that my idea will not lead to a dead end.

Your idea is not incompatible with mine, however. It’s an additional
step performing some optimization. We can always re-consider it once
I have some working code.


> If this functionality could somehow be factored in to BlockStackingLM,
> all the better, since we're definitely going to need it to avoid
> BlockLMs and TableLMs from accumulating the element-lists for all their
> content. They too will need to be able to stop when their accumulated
> content-height exceeds the threshold (available BPD + a percentage?)

I’m not planning to do any change to lists or tables. Tables in
particular may create all sorts of problems with collapsed borders and
repeated headers. This will be leading us too far for what is just
a temporary hack.


> Step 2) would still be necessary, since we need to know at what point to
> resume the line-breaking later on.
> 
> The benefit being that we would catch the IPD-change long before the end
> of the page-sequence is reached by the line-breaking loop. The amount of
> elements that need to be re-generated would always remain as small as
> possible.
> 
> If I judge correctly, and it is feasible, then this may present an
> opportunity for a pure FO hack to reduce the memory consumption for
> arbitrarily sized page-sequences: use alternating page-masters with a
> different IPD. Only make the difference practically invisible to naked
> eye. FOP would still detect it, and flush the list up to that point.
>
> Thoughts, you asked? ;-)

Thanks!


> Andreas Delmelle

Vincent


Re: Hacking a temporary solution for Changing IPD

2009-06-11 Thread Andreas Delmelle

On 11 Jun 2009, at 12:40, Vincent Hennebert wrote:

Hi Vincent



I spent some time looking at the current code and it seems to me that
a hack could be implemented without too many difficulties. It  
basically

consists in 2 steps:
1. in the Knuth breaking algorithm, when creating a new active node,
  look whether the IPD for the following page is the same or not. If
  not, deactivate the node. Once we run out of active nodes, select  
the

  best of those deactivated nodes and treat it as if it were the
  regular final node. Add areas for content up to that node.

2. re-create Knuth elements, starting from the index corresponding to
  that node. Re-launch the breaking algorithm, starting from there.
  Then back to step 1, until the end of the document is reached.

Step 1 should be doable without turning everything upside down.



Step 2 implies to change the signature of the
LayoutManager.getNextKnuthElements method, adding a parameter that
corresponds to the index from where to start the generation of Knuth
elements. We could largely ignore it, except in BlockLayoutManager  
where
we would re-launch the line-breaking algorithm, taking the new IPD  
into

account.

Obviously this is a limited approach. There is likely to be
a (potentially huge) waste of CPU cycles due to the re-creation of  
Knuth
elements. There may be side effects that I’ve missed so far. But I  
think

it’s worth giving it a try.


The only thing I'm slightly concerned about, is the case where there  
would be multiple IPD changes for subsequent pages. I'm assuming that  
is the waste of CPU cycles you're referring to (?)
If I interpret correctly, we would (supposing a page-sequence without  
forced breaks and/or span changes):


a) generate the complete block list (effectively computing the line- 
breaks for the whole page-sequence)
b) when computing the page-breaks, and encountering a new page with  
different available IPD, re-generate the remaining elements and  
recompute the line-breaks after that position


b) would occur as many times as we have IPD changes.

I'm wondering whether it would not be equally feasible to have the  
FlowLM check the total height of the block-boxes up to a point  
(estimated). Do this after every call to  
childLM.getNextKnuthElements(). Then, if the total height exceeds the  
BPD for the current page, call back to check if the next page has the  
same IPD (or the same amount of columns). If so, then we happily  
continue, just as we do now. If not, then we hand the list off to the  
PageBreaker, run it through a PageBreakingAlgorithm to compute the  
page-breaks up to that point, add the areas, and resume later, passing  
the LineBreakingAlgorithm an updated LayoutContext corresponding to  
the next page.


If this functionality could somehow be factored in to BlockStackingLM,  
all the better, since we're definitely going to need it to avoid  
BlockLMs and TableLMs from accumulating the element-lists for all  
their content. They too will need to be able to stop when their  
accumulated content-height exceeds the threshold (available BPD + a  
percentage?)


Step 2) would still be necessary, since we need to know at what point  
to resume the line-breaking later on.


The benefit being that we would catch the IPD-change long before the  
end of the page-sequence is reached by the line-breaking loop. The  
amount of elements that need to be re-generated would always remain as  
small as possible.


If I judge correctly, and it is feasible, then this may present an  
opportunity for a pure FO hack to reduce the memory consumption for  
arbitrarily sized page-sequences: use alternating page-masters with a  
different IPD. Only make the difference practically invisible to naked  
eye. FOP would still detect it, and flush the list up to that point.


Thoughts, you asked? ;-)


Andreas Delmelle






Re: Hacking a temporary solution for Changing IPD

2009-06-11 Thread Vincent Hennebert
Hi Chris,

Chris Bowditch wrote:
> Vincent Hennebert wrote:
> 
>> Hi All,
> 
> Hi Vincent,
> 

>> Obviously this is a limited approach. There is likely to be
>> a (potentially huge) waste of CPU cycles due to the re-creation of Knuth
>> elements. There may be side effects that I’ve missed so far. But I think
>> it’s worth giving it a try.
> 
> Will the increase in CPU cycles only occur if there is a change in IPD,
> or will this affect performance for existing users with the same width
> on every page?

No, the impact on performance will occur only when the IPD changes.
Performance should remain the same for documents that are already
supported by the current code.


>> What I’m planning to do is create a branch in which I would 
>> experiment
>> that idea. If it turns out to be feasible then we could merge it back to
>> Trunk, advertise the ‘fix’ and document its limitations. And hopefully
>> that will relieve some pressure on the implementation of the more
>> complete new approach.
> 
> Thanks,
> 
> Chris

Vincent


Re: Hacking a temporary solution for Changing IPD

2009-06-11 Thread Chris Bowditch

Vincent Hennebert wrote:


Hi All,


Hi Vincent,



While I am making rather good progress on my prototype implementation of
a new layout approach, I am not likely to get any practical results
before some time (a year or more). Meanwhile, users keep bumping into
this Changing IPD issue, which is becoming more and more problematic.

It would be good if we had some partial, temporary solution with which
users could live until the new approach has been implemented. In many
cases, the ‘Changing IPD’ problem boils down to the first page of the
sequence having some sort of side region encroaching upon the main
content, and all the following pages having the same width. And the
content is usually made of just plain text (fo:block elements).


I have also seen use cases along those lines.



I spent some time looking at the current code and it seems to me that
a hack could be implemented without too many difficulties. It basically
consists in 2 steps:
1. in the Knuth breaking algorithm, when creating a new active node,
   look whether the IPD for the following page is the same or not. If
   not, deactivate the node. Once we run out of active nodes, select the
   best of those deactivated nodes and treat it as if it were the
   regular final node. Add areas for content up to that node.

2. re-create Knuth elements, starting from the index corresponding to
   that node. Re-launch the breaking algorithm, starting from there.
   Then back to step 1, until the end of the document is reached.

Step 1 should be doable without turning everything upside down. Step
2 implies to change the signature of the
LayoutManager.getNextKnuthElements method, adding a parameter that
corresponds to the index from where to start the generation of Knuth
elements. We could largely ignore it, except in BlockLayoutManager where
we would re-launch the line-breaking algorithm, taking the new IPD into
account.


I can't offer much technical help with the above as I'm not familar with 
those areas of the code.




Obviously this is a limited approach. There is likely to be
a (potentially huge) waste of CPU cycles due to the re-creation of Knuth
elements. There may be side effects that I’ve missed so far. But I think
it’s worth giving it a try.


Will the increase in CPU cycles only occur if there is a change in IPD, 
or will this affect performance for existing users with the same width 
on every page?




What I’m planning to do is create a branch in which I would experiment
that idea. If it turns out to be feasible then we could merge it back to
Trunk, advertise the ‘fix’ and document its limitations. And hopefully
that will relieve some pressure on the implementation of the more
complete new approach.


Thanks,

Chris




Hacking a temporary solution for Changing IPD

2009-06-11 Thread Vincent Hennebert
Hi All,

While I am making rather good progress on my prototype implementation of
a new layout approach, I am not likely to get any practical results
before some time (a year or more). Meanwhile, users keep bumping into
this Changing IPD issue, which is becoming more and more problematic.

It would be good if we had some partial, temporary solution with which
users could live until the new approach has been implemented. In many
cases, the ‘Changing IPD’ problem boils down to the first page of the
sequence having some sort of side region encroaching upon the main
content, and all the following pages having the same width. And the
content is usually made of just plain text (fo:block elements).

I spent some time looking at the current code and it seems to me that
a hack could be implemented without too many difficulties. It basically
consists in 2 steps:
1. in the Knuth breaking algorithm, when creating a new active node,
   look whether the IPD for the following page is the same or not. If
   not, deactivate the node. Once we run out of active nodes, select the
   best of those deactivated nodes and treat it as if it were the
   regular final node. Add areas for content up to that node.

2. re-create Knuth elements, starting from the index corresponding to
   that node. Re-launch the breaking algorithm, starting from there.
   Then back to step 1, until the end of the document is reached.

Step 1 should be doable without turning everything upside down. Step
2 implies to change the signature of the
LayoutManager.getNextKnuthElements method, adding a parameter that
corresponds to the index from where to start the generation of Knuth
elements. We could largely ignore it, except in BlockLayoutManager where
we would re-launch the line-breaking algorithm, taking the new IPD into
account.

Obviously this is a limited approach. There is likely to be
a (potentially huge) waste of CPU cycles due to the re-creation of Knuth
elements. There may be side effects that I’ve missed so far. But I think
it’s worth giving it a try.

What I’m planning to do is create a branch in which I would experiment
that idea. If it turns out to be feasible then we could merge it back to
Trunk, advertise the ‘fix’ and document its limitations. And hopefully
that will relieve some pressure on the implementation of the more
complete new approach.


Thoughts, anyone?
Thanks,
Vincent