Re: Element list generation for tables (special case)

2005-08-02 Thread Jeremias Maerki
Thanks Simon. Now I understand. I've just committed the general fix for
the problem (I think), but have hard-coded the penalty value to 900 for
the moment. I think your idea might be interesting but I'm a little
clueless about the right formula to calculate an appropriate penalty
value. I'll leave that for someone more mathematically gifted. :-) The
code line is clearly marked with a TODO item if anyone wants to try. I'm
happy enough with the result. :-)

On 31.07.2005 12:17:31 Simon Pepping wrote:
> On Sat, Jul 30, 2005 at 03:46:31PM +0200, Jeremias Maerki wrote:
> > Sorry, but I have trouble understanding what you mean. Could you please
> > elaborate with an example? Thanks.
> >
> > On 30.07.2005 13:54:25 Simon Pepping wrote:
> > > On Wed, Jul 27, 2005 at 10:40:25PM +0200, Jeremias Maerki wrote:
> > > > I was under the impression that the breaker automatically favors break
> > > > decisions that take up less space. It even goes so far that if you have
> > > > a minimum="0pt" and an optimum="2opt" on a space-before, that it
> > > > currently chooses "0pt" which is not so good, actually.
> > >
> > > Penalties would help. If there were a penalty associated with the
> > > break below 'B', then the break above it becomes more favourable. I do
> > > not think the breaker could do that otherwise (without the newly
> > > proposed rule).
> 
> If there were a penalty value associated with a break that makes the
> table longer, e.g. 0.1 * w, then the following list would result:
> 
>  8) box w=9600
>  9) penalty p=0 w=0
> 10) box w=28800
> 11) penalty p=0 w=0
> 12) box w=0 //<-- this is where the second row starts
> 13) penalty p=960 w=9600  //this penalty is due to the possible break after "B"
> 14) box w=28800
> 15) penalty p=0 w=0 //this is the next break poss after three lines
> //due to the orphan setting
> 16) box w=28800
> 
> Now a break at 12 would have 960 demerits. A break at 10 would have 0
> demerits, but because it would have less content on the page it would
> have a larger stretch, and that would itself have associated demerits, say
> 500. Then the break at 10 would be selected.
> 
> In general, the table breaker may select breaks with a skew placement
> of table contents, e.g.
> 
> xxx  |
>      |
>    --|-
>      |
> xxx' | yyy
> 
> over breaks with a more even placement of table contents, e.g.
> 
> xxx  | yyy'
>      |
>    --|-
>      |
> xxx' | yyy"
> 
> Such breaks are rather ugly. They also make the table considerably
> longer. One can use the extra length of the table as a measure of
> skew placement and thus of ugliness and of the penalty value
> associated with this break. As a result, breaks with a skew
> placement of content are disfavoured, and only selected when more
> even breaks have lots of demerits themselves, due to other causes.
> 
> Regards, Simon
> 
> --
> Simon Pepping
> home page: http://www.leverkruid.nl



Jeremias Maerki



Re: Element list generation for tables (special case)

2005-08-01 Thread Andreas L Delmelle

Merely FYI: slight correction needed...


On 30.07.2005 15:14:04 Andreas L Delmelle wrote:


> Currently, I don't think we already have a mapping of these
> object->applicable_props anywhere, ...


We do have such a map in org.apache.fop.fo.PropertySets, but I don't 
get the impression that it is equipped to allow lookup of whether a 
given property is applicable for a given formatting object...



Cheers,

Andreas



Re: Element list generation for tables (special case)

2005-07-31 Thread Simon Pepping
On Sat, Jul 30, 2005 at 03:46:31PM +0200, Jeremias Maerki wrote:
> Sorry, but I have trouble understanding what you mean. Could you please
> elaborate with an example? Thanks.
>
> On 30.07.2005 13:54:25 Simon Pepping wrote:
> > On Wed, Jul 27, 2005 at 10:40:25PM +0200, Jeremias Maerki wrote:
> > > I was under the impression that the breaker automatically favors break
> > > decisions that take up less space. It even goes so far that if you have
> > > a minimum="0pt" and an optimum="2opt" on a space-before, that it
> > > currently chooses "0pt" which is not so good, actually.
> >
> > Penalties would help. If there were a penalty associated with the
> > break below 'B', then the break above it becomes more favourable. I do
> > not think the breaker could do that otherwise (without the newly
> > proposed rule).

If there were a penalty value associated with a break that makes the
table longer, e.g. 0.1 * w, then the following list would result:

 8) box w=9600
 9) penalty p=0 w=0
10) box w=28800
11) penalty p=0 w=0
12) box w=0 //<-- this is where the second row starts
13) penalty p=960 w=9600  //this penalty is due to the possible break after "B"
14) box w=28800
15) penalty p=0 w=0 //this is the next break poss after three lines
//due to the orphan setting
16) box w=28800

Now a break at 12 would have 960 demerits. A break at 10 would have 0
demerits, but because it would have less content on the page it would
have a larger stretch, and that would itself have associated demerits, say
500. Then the break at 10 would be selected.
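
The arithmetic in this example can be sketched roughly as follows. This is only an illustration, not FOP code: the class and method names are hypothetical, the factor 0.1 is the example value used above, and adding the penalty to an assumed stretch cost is a simplification of how the breaker really computes demerits.

```java
/** Hypothetical sketch; names are illustrative and not part of FOP. */
final class PenaltySketch {

    /** Example factor taken from the message above (penalty = 0.1 * w). */
    static final double FACTOR = 0.1;

    /** Penalty value for a break whose penalty element has width w (millipoints). */
    static int penaltyFor(int w) {
        return (int) Math.round(FACTOR * w);
    }

    /**
     * Simplified cost of a break: its penalty plus the demerits assumed for
     * stretching the page. The real breaker's formula is more involved.
     */
    static int cost(int penalty, int stretchDemerits) {
        return penalty + stretchDemerits;
    }

    public static void main(String[] args) {
        int skewBreak = cost(penaltyFor(9600), 0); // break that lengthens the table
        int evenBreak = cost(penaltyFor(0), 500);  // earlier break, more stretch
        System.out.println(skewBreak > evenBreak ? "earlier break wins" : "skew break wins");
    }
}
```

With a constant penalty of 0, as in the current code, both candidates would differ only by their stretch, which is why the skew break can win.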

In general, the table breaker may select breaks with a skew placement
of table contents, e.g.

xxx  |
     |
   --|-
     |
xxx' | yyy

over breaks with a more even placement of table contents, e.g.

xxx  | yyy'
     |
   --|-
     |
xxx' | yyy"

Such breaks are rather ugly. They also make the table considerably
longer. One can use the extra length of the table as a measure of
skew placement and thus of ugliness and of the penalty value
associated with this break. As a result, breaks with a skew
placement of content are disfavoured, and only selected when more
even breaks have lots of demerits themselves, due to other causes.

Regards, Simon

--
Simon Pepping
home page: http://www.leverkruid.nl



Re: Element list generation for tables (special case)

2005-07-30 Thread Jeremias Maerki
Thanks for looking it up. I've put it on the todo list on the Wiki so it
doesn't get forgotten. It's low priority anyway. It's probably a good
exercise for someone who wants to get into how the FO tree works.

On 30.07.2005 15:14:04 Andreas L Delmelle wrote:
> 
> On Jul 30, 2005, at 11:51, Jeremias Maerki wrote:
> 
> > D'oh, right. :-) Lucky me.
> >
> > Too bad, we don't generate validation warnings for misplaced
> > non-inherited properties. Didn't we have that discussion already this
> > year? I can't find it or am I imagining it?
> 
> I also remember this being mentioned... Yep, found it. A thread from 
> about a month ago.
> 
> http://marc.theaimsgroup.com/?l=fop-dev&m=111962589510266&w=2
> 
> As Glen indicated, the XSL-FO Rec starts off by allowing any property 
> on any object, but further on, it does state that for every class of 
> objects there is a specific set of applicable properties.
> 
> Thinking of ideas on implementing such checks... Currently, I don't 
> think we already have a mapping of these object->applicable_props 
> anywhere, and maybe we don't even need such a map. Since the 
> PropertyList is a temporary list anyway, whose individual properties 
> get bound to member variables of the respective objects, is it safe to 
> say that the FObj subclass' member variables --or at least a subset-- 
> corresponds to the set of applicable properties?
> 
> If that is true, what we're looking for seems to be a possibility to 
> check whether the list contains any unbound properties after the call 
> to --or ending-- FObj.bind().
> 
> Shouldn't cost too much, I think.
> 
> Cheers,
> 
> Andreas



Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-30 Thread Jeremias Maerki

On 30.07.2005 13:07:40 Andreas L Delmelle wrote:
> 
> On Jul 29, 2005, at 23:25, Jeremias Maerki wrote:
> 
> > Strike that. Just found a mean case where my quick hack breaks. Back to
> > frame one and a half. It's going to be a bit more difficult.
> 
> FWIW: It occurred to me that, with a break-before="page" on the 
> fo:block in the second column/second row, the result you initially 
> posted would be correct... at least, I think so :-/
> This made me wonder if the rule has to be formulated differently.
> 
> Let's make it: until we reach the first grid unit in the row that has a 
> box that actually causes a break --either through a forced break or 
> imposed by bpd constraints-- all break possibilities in previous grid 
> units are ONLY possibilities.
> Those possibilities need to be taken into consideration, if and only if:
> 1) the breaking grid unit has previous boxes that still fit on the page
> 2) or the break was forced
> (or = inclusive)

Right, important point. I forgot about the hard breaks. My quick hack
would also have failed with those. But I already have another idea how
to fix this without too much effort. I'll try that on Tuesday when my
brain isn't preoccupied with the weekend and the national holiday on
Monday. :-)

> For the following grid units in the same row, we have enough 
> information to decide if we need to break before their first box or 
> not, so they do not necessarily have to have contributed their 'one 
> box'.
> 
> So, IIC, the grid units in a row each have to contribute ALL of their 
> boxes until the first real break (more than a possible break). In the 
> presented case, this comes down to the same thing as saying that they 
> have to contribute one box, but that was a simplified case for 
> demonstration purposes.
> 
> If implementing the rule that way is possible, I think this would hold 
> for most cases.

I agree.

> HTH!

Thanks. It does.


Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-30 Thread Jeremias Maerki
Sorry, but I have trouble understanding what you mean. Could you please
elaborate with an example? Thanks.

On 30.07.2005 13:54:25 Simon Pepping wrote:
> On Wed, Jul 27, 2005 at 10:40:25PM +0200, Jeremias Maerki wrote:
> > I was under the impression that the breaker automatically favors break
> > decisions that take up less space. It even goes so far that if you have
> > a minimum="0pt" and an optimum="2opt" on a space-before, that it
> > currently chooses "0pt" which is not so good, actually.
> 
> Penalties would help. If there were a penalty associated with the
> break below 'B', then the break above it becomes more favourable. I do
> not think the breaker could do that otherwise (without the newly
> proposed rule).
>  
> > Well, we have several documented examples on the Wiki which we could
> > play through to see if the breaker is likely to make bad break decisions.
> > 
> > But I get the impression that this avoids the topic I raised. :-) I
> > think this here is not about whether these special break conditions are
> > favored or avoided but if they should be allowed at all.
> > 
> > On 27.07.2005 21:54:00 Simon Pepping wrote:
> > > One thing that IMHO is still lacking in the table breaking code is
> > > penalty values. ATM all penalties are 0. I believe the penalty value
> > > should depend on the extra vertical size that the break contributes,
> > > that is, on the penalty's width. I have no idea about the
> > > multiplication constant, nor if it should be linear or quadratic. I am
> > > not sure if it avoids the current case, but it is surely needed in
> > > order to favour better breaks over worse ones.
> 
> Regards, Simon
> 
> -- 
> Simon Pepping
> home page: http://www.leverkruid.nl



Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-30 Thread Andreas L Delmelle


On Jul 30, 2005, at 11:51, Jeremias Maerki wrote:


> D'oh, right. :-) Lucky me.
>
> Too bad, we don't generate validation warnings for misplaced
> non-inherited properties. Didn't we have that discussion already this
> year? I can't find it or am I imagining it?


I also remember this being mentioned... Yep, found it. A thread from 
about a month ago.


http://marc.theaimsgroup.com/?l=fop-dev&m=111962589510266&w=2

As Glen indicated, the XSL-FO Rec starts off by allowing any property 
on any object, but further on, it does state that for every class of 
objects there is a specific set of applicable properties.


Thinking of ideas on implementing such checks... Currently, I don't 
think we already have a mapping of these object->applicable_props 
anywhere, and maybe we don't even need such a map. Since the 
PropertyList is a temporary list anyway, whose individual properties 
get bound to member variables of the respective objects, is it safe to 
say that the FObj subclass' member variables --or at least a subset-- 
corresponds to the set of applicable properties?


If that is true, what we're looking for seems to be a possibility to 
check whether the list contains any unbound properties after the call 
to --or ending-- FObj.bind().


Shouldn't cost too much, I think.
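
A minimal sketch of that check, under stated assumptions: the sets below stand in for the names found in the PropertyList, the names the object bound to member variables, and the inherited properties (which may legitimately appear on any object). None of this models FOP's real PropertyList or FObj.bind(); the class, the method and the sample property sets are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch; FOP's PropertyList and FObj.bind() are not modelled here. */
final class UnboundPropertyCheck {

    /**
     * Returns the names in 'specified' that are neither in 'bound' nor in
     * 'inherited', preserving the order in which they were specified.
     */
    static Set<String> misplaced(Set<String> specified, Set<String> bound, Set<String> inherited) {
        Set<String> rest = new LinkedHashSet<>(specified);
        rest.removeAll(bound);
        rest.removeAll(inherited);
        return rest;
    }

    public static void main(String[] args) {
        // Example data only: properties written on some object in the FO file.
        Set<String> specified = new LinkedHashSet<>(
                Arrays.asList("number-rows-spanned", "keep-together", "color"));
        Set<String> bound = new LinkedHashSet<>(Arrays.asList("keep-together"));
        Set<String> inherited = new LinkedHashSet<>(Arrays.asList("color"));
        for (String name : misplaced(specified, bound, inherited)) {
            System.out.println("warning: property '" + name + "' does not apply here");
        }
    }
}
```

The cost is two set subtractions per object, which fits the expectation that the check would be cheap.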

Cheers,

Andreas



Re: Element list generation for tables (special case)

2005-07-30 Thread Simon Pepping
On Wed, Jul 27, 2005 at 10:40:25PM +0200, Jeremias Maerki wrote:
> I was under the impression that the breaker automatically favors break
> decisions that take up less space. It even goes so far that if you have
> a minimum="0pt" and an optimum="2opt" on a space-before, that it
> currently chooses "0pt" which is not so good, actually.

Penalties would help. If there were a penalty associated with the
break below 'B', then the break above it becomes more favourable. I do
not think the breaker could do that otherwise (without the newly
proposed rule).
 
> Well, we have several documented examples on the Wiki which we could
> play through to see if the breaker is likely to make bad break decisions.
> 
> But I get the impression that this avoids the topic I raised. :-) I
> think this here is not about whether these special break conditions are
> favored or avoided but if they should be allowed at all.
> 
> On 27.07.2005 21:54:00 Simon Pepping wrote:
> > One thing that IMHO is still lacking in the table breaking code is
> > penalty values. ATM all penalties are 0. I believe the penalty value
> > should depend on the extra vertical size that the break contributes,
> > that is, on the penalty's width. I have no idea about the
> > multiplication constant, nor if it should be linear or quadratic. I am
> > not sure if it avoids the current case, but it is surely needed in
> > order to favour better breaks over worse ones.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.nl



Re: Element list generation for tables (special case)

2005-07-30 Thread Simon Pepping
On Wed, Jul 27, 2005 at 10:40:25PM +0200, Jeremias Maerki wrote:
> But I get the impression that this avoids the topic I raised. :-) I
> think this here is not about whether these special break conditions are
> favored or avoided but if they should be allowed at all.

OK. Yes, the rule you propose sounds OK.

Inside a row group, you may limit the rule to those columns which
start a grid unit in this row, and exclude the columns which span into
this row from a previous row.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.nl



Re: Element list generation for tables (special case)

2005-07-30 Thread Andreas L Delmelle


On Jul 29, 2005, at 23:25, Jeremias Maerki wrote:


> Strike that. Just found a mean case where my quick hack breaks. Back to
> frame one and a half. It's going to be a bit more difficult.


FWIW: It occurred to me that, with a break-before="page" on the 
fo:block in the second column/second row, the result you initially 
posted would be correct... at least, I think so :-/

This made me wonder if the rule has to be formulated differently.

Let's make it: until we reach the first grid unit in the row that has a 
box that actually causes a break --either through a forced break or 
imposed by bpd constraints-- all break possibilities in previous grid 
units are ONLY possibilities.

Those possibilities need to be taken into consideration, if and only if:
1) the breaking grid unit has previous boxes that still fit on the page
2) or the break was forced
(or = inclusive)

For the following grid units in the same row, we have enough 
information to decide if we need to break before their first box or 
not, so they do not necessarily have to have contributed their 'one 
box'.


So, IIC, the grid units in a row each have to contribute ALL of their 
boxes until the first real break (more than a possible break). In the 
presented case, this comes down to the same thing as saying that they 
have to contribute one box, but that was a simplified case for 
demonstration purposes.


If implementing the rule that way is possible, I think this would hold 
for most cases.


HTH!

Cheers,

Andreas



Re: Element list generation for tables (special case)

2005-07-30 Thread Jeremias Maerki
D'oh, right. :-) Lucky me.

Too bad, we don't generate validation warnings for misplaced
non-inherited properties. Didn't we have that discussion already this
year? I can't find it or am I imagining it?

On 30.07.2005 03:47:45 Andreas L Delmelle wrote:
> On Jul 28, 2005, at 14:04, Jeremias Maerki wrote:
> 
> > On 28.07.2005 13:42:08 Andreas L Delmelle wrote:
> >>
> >> Where it comes to rowspans:
> >> In my modified example, if you move all the text in the middle column
> >> to the first row and make that cell span two rows, things get a bit
> >> awkward without the proposed rule anyway...
> >
> > Ouch. You definitely hit a bug here. The height calculation rule should
> > have placed the two "B"s right under the "A"s (i.e. first row height =
> > 8pt).
> 
> A bug in me, that's for sure! I misplaced the rowspan property :-)
> 
> Sorry.
> 
> Cheers,
> 
> Andreas



Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-29 Thread Andreas L Delmelle

On Jul 28, 2005, at 14:04, Jeremias Maerki wrote:


> On 28.07.2005 13:42:08 Andreas L Delmelle wrote:
>>
>> Where it comes to rowspans:
>> In my modified example, if you move all the text in the middle column
>> to the first row and make that cell span two rows, things get a bit
>> awkward without the proposed rule anyway...
>
> Ouch. You definitely hit a bug here. The height calculation rule should
> have placed the two "B"s right under the "A"s (i.e. first row height =
> 8pt).


A bug in me, that's for sure! I misplaced the rowspan property :-)

Sorry.

Cheers,

Andreas



Re: Element list generation for tables (special case)

2005-07-29 Thread Jeremias Maerki
Strike that. Just found a mean case where my quick hack breaks. Back to
frame one and a half. It's going to be a bit more difficult.

On 29.07.2005 23:04:08 Jeremias Maerki wrote:
> As I suspected, the rule I proposed was very easy to implement
> (13 new lines).


Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-29 Thread Jeremias Maerki

On 28.07.2005 15:12:50 Chris Bowditch wrote:
> I've been following this thread with interest. From a conceptual point 
> of view, I agree with Andreas. I can't see any situation where you might 
> want to have cells of the same row group on separate pages. Regardless 
> of how many rows a particular cell spans.
> 
> Is there nothing in the spec to give you some clues?

I haven't found anything.

As I suspected, the rule I proposed was very easy to implement
(13 new lines). Too bad minotaur is down so I can't commit.

The only thing that my quick work-around can't handle is if someone does
something like this:


  

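<fo:table-row>
  <fo:table-cell>
    <fo:block/>
    <fo:block>A</fo:block>
  </fo:table-cell>
  <fo:table-cell>
    <fo:block/>
    <fo:block>Some long text whose first three lines will be kept
      together by an orphan setting.</fo:block>
  </fo:table-cell>
</fo:table-row>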


The text "A" may still be separated from the second block in the second
table-cell because both cells have already contributed a box and a
penalty to the element list of each cell. The algorithm doesn't check if
the contributed boxes have any meaningful content. Obviously, that's the
limit of my current proposal. But it's a start.


Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-28 Thread Chris Bowditch
I've been following this thread with interest. From a conceptual point 
of view, I agree with Andreas. I can't see any situation where you might 
want to have cells of the same row group on separate pages. Regardless 
of how many rows a particular cell spans.


Is there nothing in the spec to give you some clues?



Chris



Re: Element list generation for tables (special case)

2005-07-28 Thread Jeremias Maerki

On 28.07.2005 13:42:08 Andreas L Delmelle wrote:
> On Jul 28, 2005, at 10:10, Jeremias Maerki wrote:
> 
> > On 27.07.2005 23:26:48 Andreas L Delmelle wrote:
> >> Indeed, doesn't look right. Given the value for the orphans property,
> >> one still would reasonably expect the break to occur before the first
> >> cell of the second row.
> >
> > ...or after the first 3 lines of the second row.
> 
> Yes, but IIC, there isn't enough space left on the page to display 
> those, hence 'break before the row'.

Right. I just wanted to point out all the relevant break possibilities.

> 
> > These are all valid possibilities, but as I hinted I want to discuss
> > this at conceptual level not implementation level. I want to know if we
> > can have a general rule that we don't allow breaks before every cell
> > contributed at least one box to the combined element list. Also, Simon
> > and you are talking about providing higher penalty values, but I asked
> > about allowing a break at all (i.e. INFINITE penalty, or rather no
> > penalty at all, only a box). Considering a penalty value p<INFINITE
> > requires a decision that such breaks are possible/desirable in the
> > first place.
> 
> Well, if it is only the conceptual side that matters ATM, I don't see 
> any immediate problem with such a rule.

Ok.

> 
> > Right, but is that rule ok or not from a conceptual view. Are there any
> > cases where it might be bad?
> 
> Definitely OK for me. I can't seem to imagine a situation where this 
> rule might cause undesirable effects. On the contrary, it reminds me of 
> a question Luca raised some time ago when implementing lists, roughly : 
> "Is it desirable that, due to a page-break, label and body end up on 
> different pages?"

D'oh. I missed that. Definitely the same problem.

> I didn't think it was, and I don't think this should 
> happen in case of tables either. The different columns in a row should 
> always be considered together, not one-by-one. So the first column can 
> never be allowed to end up in its entirety on a different page than the 
> second.
> 
> Where it comes to rowspans:
> In my modified example, if you move all the text in the middle column 
> to the first row and make that cell span two rows, things get a bit 
> awkward without the proposed rule anyway. (see attached PDF: the middle 
> cell doesn't span the two rows, and the 'second' row only has two 
> cells, and they're considered to be the first and second cell --while 
> they should actually be the first and third)

Ouch. You definitely hit a bug here. The height calculation rule should
have placed the two "B"s right under the "A"s (i.e. first row height =
8pt).


Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-28 Thread Andreas L Delmelle

On Jul 28, 2005, at 10:10, Jeremias Maerki wrote:


> On 27.07.2005 23:26:48 Andreas L Delmelle wrote:
>> Indeed, doesn't look right. Given the value for the orphans property,
>> one still would reasonably expect the break to occur before the first
>> cell of the second row.
>
> ...or after the first 3 lines of the second row.


Yes, but IIC, there isn't enough space left on the page to display 
those, hence 'break before the row'.




> These are all valid possibilities, but as I hinted I want to discuss
> this at conceptual level not implementation level. I want to know if we
> can have a general rule that we don't allow breaks before every cell
> contributed at least one box to the combined element list. Also, Simon
> and you are talking about providing higher penalty values, but I asked
> about allowing a break at all (i.e. INFINITE penalty, or rather no
> penalty at all, only a box). Considering a penalty value p<INFINITE
> requires a decision that such breaks are possible/desirable in the
> first place.


Well, if it is only the conceptual side that matters ATM, I don't see 
any immediate problem with such a rule.




> Right, but is that rule ok or not from a conceptual view. Are there any
> cases where it might be bad?


Definitely OK for me. I can't seem to imagine a situation where this 
rule might cause undesirable effects. On the contrary, it reminds me of 
a question Luca raised some time ago when implementing lists, roughly : 
"Is it desirable that, due to a page-break, label and body end up on 
different pages?" I didn't think it was, and I don't think this should 
happen in case of tables either. The different columns in a row should 
always be considered together, not one-by-one. So the first column can 
never be allowed to end up in its entirety on a different page than the 
second.


Where it comes to rowspans:
In my modified example, if you move all the text in the middle column 
to the first row and make that cell span two rows, things get a bit 
awkward without the proposed rule anyway. (see attached PDF: the middle 
cell doesn't span the two rows, and the 'second' row only has two 
cells, and they're considered to be the first and second cell --while 
they should actually be the first and third)




table-body4c.xml.head.pdf
Description: Adobe PDF document





table-body4c.xml
Description: application/text




Greetz,

AD




Re: Element list generation for tables (special case)

2005-07-28 Thread Jeremias Maerki

On 27.07.2005 23:26:48 Andreas L Delmelle wrote:
> On Jul 27, 2005, at 20:45, Jeremias Maerki wrote:
> 
> Hi,
> 
> > I got a test case for tables which raises not a technical but rather an
> > interesting conceptual question. Please have a look at the attached
> > test case. It defines a table with two columns and two rows. In the given
> > setup the second row creates a break decision with the current code
> > that can be argued as being bad (see the PDF).
> 
> Indeed, doesn't look right. Given the value for the orphans property,  
> one still would reasonably expect the break to occur before the first  
> cell of the second row.

...or after the first 3 lines of the second row.

> BTW: tried adding a third column mirroring the first, and this leads to  
> ONLY the second column being moved to the next page... This as a  
> further demonstration that the currently produced result still leaves a  
> bit to be desired. (see attach)

That was to be expected because the element lists from the first and
third column will likely be the same and therefore won't produce a
different combined element list.

> > Here's an excerpt from the element list:
> >
> >  8) box w=9600
> >  9) penalty p=0 w=0
> > 10) box w=28800
> > 11) penalty p=0 w=0
> > 12) box w=0 //<-- this is where the second row starts
> > 13) penalty p=0 w=9600  //this penalty is due to the possible break after "B"
> > 14) box w=28800
> > 15) penalty p=0 w=0 //this is the next break poss after three lines
> > //due to the orphan setting
> > 16) box w=28800
> >
> > While working on element list generation for tables I came across this
> > question and decided not to do anything about it, especially since
> > removing some of these break possibilities might not be desirable in
> > all cases.
> >
> > A rule that could be easily implemented would be that we allow the
> > first break possibility only after every cell in a new row contributed at
> > least one of its own boxes to the combined element list.
> 
> So IOW, if I get this correctly: all break possibilities are to be  
> considered preliminary until the last cell occupying this row (= last  
> grid-unit in the row) has been taken into account?

Almost. In different words again: this means the first step is only
after each newly started cell in a new row contributes at least one box
to the combined element list. I wouldn't want to work with something
like a preliminary break possibility as it suggests that you somehow
have to revisit the list. I'd rather improve the getNextStep method to
only return for the first time after the above rule is met.
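
To make the rule concrete, here is a rough sketch under simplifying assumptions. It is not FOP's getNextStep(): the class and method are hypothetical, borders and penalties are ignored, and each cell is reduced to the heights (from the top of the row) at which one of its boxes ends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch; this is not FOP's getNextStep(). */
final class FirstStepSketch {

    /**
     * cellBreaks[i] lists, in ascending order, the heights (measured from the
     * top of the row) at which cell i ends one of its boxes.
     * Returns the heights usable as steps once the rule is applied.
     */
    static List<Integer> steps(int[][] cellBreaks) {
        // The first step is allowed only once every cell has finished a box.
        int firstAllowed = 0;
        for (int[] cell : cellBreaks) {
            if (cell.length > 0) {
                firstAllowed = Math.max(firstAllowed, cell[0]);
            }
        }
        // Collect all remaining candidate heights, sorted and without duplicates.
        TreeSet<Integer> result = new TreeSet<>();
        for (int[] cell : cellBreaks) {
            for (int h : cell) {
                if (h >= firstAllowed) {
                    result.add(h);
                }
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        // Second row of the test case: "B" ends at 9600, the long text
        // first allows a break after three lines (28800).
        int[][] row = { {9600}, {28800, 57600} };
        System.out.println(steps(row));
    }
}
```

For the second row of the test case the step at 9600 is never produced, so the break after "B" alone disappears from the combined element list without any need to revisit it.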

> > An example: If you look at page 1 of [1], step 1 would be ignored. On
> > page 3 of [1], the steps 1 and 2 would be ignored.
> > [1] http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf
> 
> Hmm... Do you mean that the steps would be performed but their results  
> discarded, or that the steps simply would not be performed at all?

Not performed at all. See above.

> I'd think the first, but just want to make sure...
> 
> Are the break possibilities currently considered only at the level of  
> the table body --so the element list contains the elements for the  
> cells' boxes, but no separate elements/indicators of row-boundaries?

We seem to use different vocabulary for this. I don't think
we can say that the breaks are considered at table body level. And you
have to be careful about which element list you mean: the individual
cell element lists or the effective combined element list. Let me
explain how this is implemented:

The TableRowIterator simply provides effective rows with grid units.
For TableContentLM it chooses an array of effective rows which forms a
row group so that no column-spanned cell is split between groups. See
the Wiki for details. Such a row group is the minimal work item for
combining element lists. There is always a break possibility before and
after a row group (except if there is a keep constraint on a row, for
example). Inside a row group the break possibilities are determined by
the getNextStep() method where the combined element list is created.
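
How rows might be collected into such groups can be sketched as follows. This is an assumption-laden illustration, not the TableRowIterator: it assumes the spans that matter are cells extending over several rows, and the class, method and input format are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not FOP's TableRowIterator. */
final class RowGroupSketch {

    /**
     * maxSpan[r] is the largest number of rows spanned by any cell that
     * starts in row r (1 if no cell spans). Returns {first, last} row index
     * pairs such that no spanning cell crosses a group boundary.
     */
    static List<int[]> groups(int[] maxSpan) {
        List<int[]> out = new ArrayList<>();
        int start = 0;
        while (start < maxSpan.length) {
            int end = start;
            // Extend the group while rows inside it hold cells reaching further down.
            for (int r = start; r <= end; r++) {
                int reach = Math.min(maxSpan.length - 1, r + maxSpan[r] - 1);
                end = Math.max(end, reach);
            }
            out.add(new int[] {start, end});
            start = end + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        // Row 1 holds a cell spanning two rows, so rows 1 and 2 form one group.
        for (int[] g : groups(new int[] {1, 2, 1, 1})) {
            System.out.println(g[0] + ".." + g[1]);
        }
    }
}
```

Each resulting group would then be the unit for which one combined element list is built.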

> In that case --with the risk of underestimating the complexity of what  
> I propose--, perhaps an alternative to the suggested rule would be to  
> insert a step that combines the generated boxes/penalties only after  
> the element list for the last grid unit in a logical row has been  
> created (?) Anyway, instead of simply ignoring those steps, we could  
> also increase the penalty value for the offending break possibility  
> (currently: p=0 for all of them)
> So, IOW, for each row, store the element lists, and after all lists are  
> available, review the calculated penalty values, increasing them when a  
> given break possibility has undesirable consequences when the other  
> element lists for the row are taken into account.
> Or the other way around: 

Re: Element list generation for tables (special case)

2005-07-27 Thread Andreas L Delmelle


Sorry, forgot the attachment...



table-body4b.xml.head.pdf
Description: Adobe PDF document


On Jul 27, 2005, at 23:26, Andreas L Delmelle wrote:


On Jul 27, 2005, at 20:45, Jeremias Maerki wrote:

Hi,


I got a test case for tables which raises not a technical but rather an
interesting conceptual question. Please have a look at the attached
test case. It defines a table with two columns and two rows. In the given
setup the second row creates a break decision with the current code
that can be argued as being bad (see the PDF).


Indeed, doesn't look right. Given the value for the orphans property,  
one still would reasonably expect the break to occur before the first  
cell of the second row.


BTW: tried adding a third column mirroring the first, and this leads  
to ONLY the second column being moved to the next page... This as a  
further demonstration that the currently produced result still leaves  
a bit to be desired. (see attach)



Here's an excerpt from the element list:

 8) box w=9600
 9) penalty p=0 w=0
10) box w=28800
11) penalty p=0 w=0
12) box w=0 //<-- this is where the second row starts
13) penalty p=0 w=9600  //this penalty is due to the possible break  
after "B"

14) box w=28800
15) penalty p=0 w=0 //this is the next break poss after three  
lines

//due to the orphan setting
16) box w=28800

While working on element list generation for tables I came across this
question and decided not to do anything about it, especially since
removing some of these break possibilities might not be desirable in  
all

cases.

A rule that could be easily implemented would be that we allow the  
first

break possibility only after every cell in a new row contributed at
least one of its own boxes to the combined element list.


So IOW, if I get this correctly: all break possibilities are to be  
considered preliminary until the last cell occupying this row (= last  
grid-unit in the row) has been taken into account?


An example: If you look at page 1 of [1], step 1 would over ignored.  
On

page 3 of [1], the steps 1 and 2 would be ignored.
[1]  
http://people.apache.org/~jeremias/fop/ 
KnuthBoxesForTablesWithBorders.pdf


Hmm... Do you mean that the steps would be performed but their results  
discarded, or that the steps simply would not be performed at all?

I'd think the first, but just want to make sure...

Are the break possibilities currently considered only at the level of  
the table body --so the element list contains the elements for the  
cells' boxes, but no separate elements/indicators of row-boundaries?


In that case --with the risk of underestimating the complexity of what  
I propose--, perhaps an alternative to the suggested rule would be to  
insert a step that combines the generated boxes/penalties only after  
the element list for the last grid unit in a logical row has been  
created (?) Anyway, instead of simply ignoring those steps, we could  
also increase the penalty value for the offending break possibility  
(currently: p=0 for all of them)
So, IOW, for each row, store the element lists, and after all lists  
are available, review the calculated penalty values, increasing them  
when a given break possibility has undesirable consequences when the  
other element lists for the row are taken into account.
Or the other way around: give them a default penalty value that is  
high enough, then afterwards decreasing them for the most favorable  
break possibilities.

Or modify all boxes' widths (=heights) to be equal to the largest box.
After this step is completed, add the combined element list to the  
body.


IIC, the two separate element lists for the second row would be:

First grid unit:
1) box w=9600
2) penalty p=0 w=0

Second grid unit:
1) box w=28800
2) penalty p=0 w=0

So, compare the first boxes' widths and, since the first box in the  
first list is smaller than that in the second list, either increase  
the penalty value for the second step in the first list, or change the  
width of the first box in the first list. Maybe the latter is more  
attractive, since the resulting combined list can then be created by  
concatenating the two separate lists...


[Admitted: this particular case is rather simple, since both lists  
only have one box.]


Then combine the lists to arrive at the result below:


With this rule the element list would look like this:




12) box w=28800 //<-- this is where the second row starts
13) penalty p=0 w=0
14) box w=28800

I'm unsure ATM what this would mean for cases with row spanning,  
though.


As long as the criterion is that every _grid unit_ for the (logical)  
row in question must have contributed at least one box, I wouldn't  
expect any particular problem.


I can see that this new rule would make this better in most cases.  
What

worries me is that there might be cases where we wouldn't want that
behaviour, although ATM I can't see them. So I just want to check with
you tha

Re: Element list generation for tables (special case)

2005-07-27 Thread Andreas L Delmelle

On Jul 27, 2005, at 20:45, Jeremias Maerki wrote:

> Hi,
>
> I got a test case for tables which raises not a technical but rather an
> interesting conceptual question. Please have a look at the attached test
> case. It defines a table with two columns and two rows. In the given
> setup the second row creates a break decision with the current code that
> can be argued as being bad (see the PDF).


Indeed, doesn't look right. Given the value for the orphans property,
one still would reasonably expect the break to occur before the first
cell of the second row.

BTW: I tried adding a third column mirroring the first, and this leads to
ONLY the second column being moved to the next page... This is a
further demonstration that the currently produced result still leaves a
bit to be desired. (see attachment)



> Here's an excerpt from the element list:
>
>  8) box w=9600
>  9) penalty p=0 w=0
> 10) box w=28800
> 11) penalty p=0 w=0
> 12) box w=0 //<-- this is where the second row starts
> 13) penalty p=0 w=9600  //this penalty is due to the possible break after "B"
> 14) box w=28800
> 15) penalty p=0 w=0 //this is the next break poss after three lines
> //due to the orphan setting
> 16) box w=28800
>
> While working on element list generation for tables I came across this
> question and decided not to do anything about it, especially since
> removing some of these break possibilities might not be desirable in all
> cases.
>
> A rule that could be easily implemented would be that we allow the first
> break possibility only after every cell in a new row contributed at
> least one of its own boxes to the combined element list.
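To illustrate the proposed rule, here is a minimal Python sketch under simplifying assumptions: each cell is modelled only as a list of box heights, penalty widths, borders and row spans are ignored, and every cell has at least one box. The names row_steps, allowed_steps and to_elements are hypothetical and are not part of FOP.

```python
from itertools import accumulate


def row_steps(cells):
    """Return the sorted cumulative heights at which any cell offers a break.

    cells: one list of box heights per cell (simplified model).
    """
    return sorted({h for boxes in cells for h in accumulate(boxes)})


def allowed_steps(cells):
    """Keep only steps reached after every cell has contributed its first box."""
    # The tallest first box decides when all cells have contributed something.
    threshold = max(boxes[0] for boxes in cells)
    return [step for step in row_steps(cells) if step >= threshold]


def to_elements(steps):
    """Turn step heights into a box/penalty list with zero-valued penalties."""
    elements = []
    previous = 0
    for index, step in enumerate(steps):
        elements.append(("box", step - previous))
        if index < len(steps) - 1:
            elements.append(("penalty", 0, 0))  # (type, p, w)
        previous = step
    return elements
```

For the second row of the test case, modelled here as cells = [[9600], [28800, 28800]] (an assumption read off elements 12 to 16), the step at 9600 is dropped and to_elements(allowed_steps(cells)) gives box w=28800, penalty p=0 w=0, box w=28800.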


So IOW, if I get this correctly: all break possibilities are to be  
considered preliminary until the last cell occupying this row (= last  
grid-unit in the row) has been taken into account?



> An example: If you look at page 1 of [1], step 1 would be ignored. On
> page 3 of [1], the steps 1 and 2 would be ignored.
>
> [1] http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf


Hmm... Do you mean that the steps would be performed but their results  
discarded, or that the steps simply would not be performed at all?

I'd think the first, but just want to make sure...

Are the break possibilities currently considered only at the level of  
the table body --so the element list contains the elements for the  
cells' boxes, but no separate elements/indicators of row-boundaries?


In that case --with the risk of underestimating the complexity of what  
I propose--, perhaps an alternative to the suggested rule would be to  
insert a step that combines the generated boxes/penalties only after  
the element list for the last grid unit in a logical row has been  
created (?) Anyway, instead of simply ignoring those steps, we could  
also increase the penalty value for the offending break possibility  
(currently: p=0 for all of them)
So, IOW, for each row, store the element lists, and after all lists are  
available, review the calculated penalty values, increasing them when a  
given break possibility has undesirable consequences when the other  
element lists for the row are taken into account.
Or the other way around: give them a default penalty value that is high
enough, then afterwards decrease it for the most favorable break
possibilities.
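A rough Python sketch of the "raise the penalty instead of dropping the step" variant follows. It assumes the same simplified model of a row (only step heights and the height of each cell's first box), and the function name penalize_early_steps is hypothetical. The default of 900 is merely the value that was later hard-coded according to the follow-up in this thread, not a derived formula.

```python
def penalize_early_steps(steps, first_box_heights, raised=900):
    """Pair each step height with a penalty value.

    Steps reached before every cell has contributed its first box get the
    raised penalty; all other steps keep p=0.
    """
    threshold = max(first_box_heights)
    return [(step, raised if step < threshold else 0) for step in steps]
```

With steps [9600, 28800, 57600] and first box heights [9600, 28800], only the step at 9600 receives the raised penalty, so the break stays possible but becomes unattractive to the breaker.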

Or modify all boxes' widths (=heights) to be equal to the largest box.
After this step is completed, add the combined element list to the body.

IIUC, the two separate element lists for the second row would be:

First grid unit:
1) box w=9600
2) penalty p=0 w=0

Second grid unit:
1) box w=28800
2) penalty p=0 w=0

So, compare the first boxes' widths and, since the first box in the  
first list is smaller than that in the second list, either increase the  
penalty value for the second step in the first list, or change the  
width of the first box in the first list. Maybe the latter is more  
attractive, since the resulting combined list can then be created by  
concatenating the two separate lists...
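The width adjustment alone could be sketched in Python as below. This is only an illustration of the idea, assuming each grid unit's element list is a list of tuples ("box", w) or ("penalty", p, w) that starts with a box; the name equalize_first_boxes is hypothetical and the combining step itself is not shown.

```python
def equalize_first_boxes(cell_lists):
    """Return copies of the lists whose first box is widened to the largest first box."""
    # Assumption: every list starts with a ("box", w) element.
    tallest = max(elements[0][1] for elements in cell_lists)
    return [[("box", tallest)] + list(elements[1:]) for elements in cell_lists]
```

Applied to the two lists above, the first grid unit's box grows from w=9600 to w=28800, so no break possibility remains below the height of the taller cell's first box.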


[Admitted: this particular case is rather simple, since both lists only  
have one box.]


Then combine the lists to arrive at the result below:


> With this rule the element list would look like this:
>
> 12) box w=28800 //<-- this is where the second row starts
> 13) penalty p=0 w=0
> 14) box w=28800
>
> I'm unsure ATM what this would mean for cases with row spanning, though.


As long as the criterion is that every _grid unit_ for the (logical)  
row in question must have contributed at least one box, I wouldn't  
expect any particular problem.



> I can see that this new rule would make this better in most cases. What
> worries me is that there might be cases where we wouldn't want that
> behaviour, although ATM I can't see them. So I just want to check with
> you that I haven't forgotten about anything. Or maybe someone has a
> better rule to implement this. Thoughts welcome.



Greetz,

AD



Re: Element list generation for tables (special case)

2005-07-27 Thread Jeremias Maerki
I was under the impression that the breaker automatically favors break
decisions that take up less space. It even goes so far that if you have
a minimum="0pt" and an optimum="2opt" on a space-before, that it
currently chooses "0pt" which is not so good, actually.

Well, we have several documented examples on the Wiki which we could
play through to see if the breaker is likely to make bad break decisions.

But I get the impression that this avoids the topic I raised. :-) I
think the question here is not whether these special break conditions are
favored or avoided but whether they should be allowed at all.

On 27.07.2005 21:54:00 Simon Pepping wrote:
> One thing that IMHO is still lacking in the table breaking code is
> penalty values. ATM all penalties are 0. I believe the penalty value
> should depend on the extra vertical size that the break contributes,
> that is, on the penalty's width. I have no idea about the
> multiplication constant, nor if it should be linear or quadratic. I am
> not sure if it avoids the current case, but it is surely needed in
> order to favour better breaks over worse ones.
> 
> Simon
> 
> On Wed, Jul 27, 2005 at 08:45:48PM +0200, Jeremias Maerki wrote:
> > I got a test case for tables which raises not a technical but rather an
> > interesting conceptual question. Please have a look at the attached test
> > case. It defines a table with two columns and two rows. In the given
> > setup the second row creates a break decision with the current code that
> > can be argued as being bad (see the PDF). Here's an excerpt from the
> > element list:
> > 
> >  8) box w=9600
> >  9) penalty p=0 w=0
> > 10) box w=28800
> > 11) penalty p=0 w=0
> > 12) box w=0 //<-- this is where the second row starts
> > 13) penalty p=0 w=9600  //this penalty is due to the possible break after "B"
> > 14) box w=28800
> > 15) penalty p=0 w=0 //this is the next break poss after three lines
> > //due to the orphan setting
> > 16) box w=28800
> > 
> > While working on element list generation for tables I came across this
> > question and decided not to do anything about it, especially since
> > removing some of these break possibilities might not be desirable in all
> > cases.
> > 
> > A rule that could be easily implemented would be that we allow the first
> > break possibility only after every cell in a new row contributed at
> > least one of its own boxes to the combined element list.
> > 
> > An example: If you look at page 1 of [1], step 1 would be ignored. On
> > page 3 of [1], the steps 1 and 2 would be ignored.
> > 
> > [1] http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf
> > 
> > With this rule the element list would look like this:
> > 
> >  8) box w=9600
> >  9) penalty p=0 w=0
> > 10) box w=28800
> > 11) penalty p=0 w=0
> > 12) box w=28800 //<-- this is where the second row starts
> > 13) penalty p=0 w=0
> > 14) box w=28800
> > 
> > I'm unsure ATM what this would mean for cases with row spanning, though.
> > 
> > I can see that this new rule would make this better in most cases. What
> > worries me is that there might be cases where we wouldn't want that
> > behaviour, although ATM I can't see them. So I just want to check with
> > you that I haven't forgotten about anything. Or maybe someone has a
> > better rule to implement this. Thoughts welcome.
> > 
> > 
> > Jeremias Maerki
> 
> 
> 
> 
> -- 
> Simon Pepping
> home page: http://www.leverkruid.nl



Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-27 Thread Simon Pepping
One thing that IMHO is still lacking in the table breaking code is
penalty values. ATM all penalties are 0. I believe the penalty value
should depend on the extra vertical size that the break contributes,
that is, on the penalty's width. I have no idea about the
multiplication constant, nor if it should be linear or quadratic. I am
not sure if it avoids the current case, but it is surely needed in
order to favour better breaks over worse ones.
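As a minimal sketch of a width-dependent penalty, the Python below shows a linear and a quadratic variant. Everything here is an assumption for illustration: the factor 0.1 is only the example value used in the follow-up quoted at the top of this thread, the function name break_penalty is hypothetical, and the cap assumes that a penalty of 1000 or more would forbid the break entirely.

```python
INFINITE = 1000  # assumption: penalties at or above this value forbid a break


def break_penalty(width, factor=0.1, quadratic=False):
    """Return a penalty value derived from the penalty's width.

    width: extra vertical size the break contributes (same unit as w).
    """
    raw = factor * width * width if quadratic else factor * width
    # Stay below INFINITE so the break remains possible, just less attractive.
    return min(int(round(raw)), INFINITE - 1)
```

With the linear variant, the break after "B" (w=9600) gets p=960, while breaks with w=0 keep p=0. A quadratic variant would need a much smaller factor to stay below the cap for widths of this size.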

Simon

On Wed, Jul 27, 2005 at 08:45:48PM +0200, Jeremias Maerki wrote:
> I got a test case for tables which raises not a technical but rather an
> interesting conceptual question. Please have a look at the attached test
> case. It defines a table with two columns and two rows. In the given
> setup the second row creates a break decision with the current code that
> can be argued as being bad (see the PDF). Here's an excerpt from the
> element list:
> 
>  8) box w=9600
>  9) penalty p=0 w=0
> 10) box w=28800
> 11) penalty p=0 w=0
> 12) box w=0 //<-- this is where the second row starts
> 13) penalty p=0 w=9600  //this penalty is due to the possible break after "B"
> 14) box w=28800
> 15) penalty p=0 w=0 //this is the next break poss after three lines
> //due to the orphan setting
> 16) box w=28800
> 
> While working on element list generation for tables I came across this
> question and decided not to do anything about it, especially since
> removing some of these break possibilities might not be desirable in all
> cases.
> 
> A rule that could be easily implemented would be that we allow the first
> break possibility only after every cell in a new row contributed at
> least one of its own boxes to the combined element list.
> 
> An example: If you look at page 1 of [1], step 1 would be ignored. On
> page 3 of [1], the steps 1 and 2 would be ignored.
> 
> [1] http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf
> 
> With this rule the element list would look like this:
> 
>  8) box w=9600
>  9) penalty p=0 w=0
> 10) box w=28800
> 11) penalty p=0 w=0
> 12) box w=28800 //<-- this is where the second row starts
> 13) penalty p=0 w=0
> 14) box w=28800
> 
> I'm unsure ATM what this would mean for cases with row spanning, though.
> 
> I can see that this new rule would make this better in most cases. What
> worries me is that there might be cases where we wouldn't want that
> behaviour, although ATM I can't see them. So I just want to check with
> you that I haven't forgotten about anything. Or maybe someone has a
> better rule to implement this. Thoughts welcome.
> 
> 
> Jeremias Maerki




-- 
Simon Pepping
home page: http://www.leverkruid.nl