Re: Element list generation for tables (special case)

2005-08-02 Thread Jeremias Maerki
Thanks Simon. Now I understand. I've just committed the general fix for
the problem (I think), but have hard-coded the penalty value to 900 for
the moment. I think your idea might be interesting but I'm a little
clueless about the right formula to calculate an appropriate penalty
value. I'll leave that for someone more mathematically gifted. :-) The
code line is clearly marked with a TODO item if anyone wants to try. I'm
happy enough with the result. :-)

On 31.07.2005 12:17:31 Simon Pepping wrote:
 On Sat, Jul 30, 2005 at 03:46:31PM +0200, Jeremias Maerki wrote:
  Sorry, but I have trouble understanding what you mean. Could you please
  elaborate with an example? Thanks.
 
  On 30.07.2005 13:54:25 Simon Pepping wrote:
   On Wed, Jul 27, 2005 at 10:40:25PM +0200, Jeremias Maerki wrote:
    I was under the impression that the breaker automatically favors break
    decisions that take up less space. It even goes so far that if you have
    a minimum=0pt and an optimum=2opt on a space-before, that it
    currently chooses 0pt which is not so good, actually.
  
   Penalties would help. If there were a penalty associated with the
   break below 'B', then the break above it becomes more favourable. I do
   not think the breaker could do that otherwise (without the newly
   proposed rule).
 
 If there were a penalty value associated with a break that makes the
 table longer, e.g. 0.1 * w, then the following list would result:
 
  8) box w=9600
  9) penalty p=0 w=0
 10) box w=28800
 11) penalty p=0 w=0
 12) box w=0 //-- this is where the second row starts
 13) penalty p=960 w=9600  //this penalty is due to the possible break after B
 14) box w=28800
 15) penalty p=0 w=0 //this is the next break poss after three lines
 //due to the orphan setting
 16) box w=28800
 
 Now a break at 12 would have 960 demerits. A break at 10 would have 0
 demerits, but because it would have less content on the page it would
 have a larger stretch and that would itself have associated demerits, say
 500. Then the break at 10 would be selected.
 
 In general, the table breaker may select breaks with a skew placement
 of table contents, e.g.
 
 xxx  |
      |
 -----|-----
      |
 xxx' | yyy
 
 over breaks with a more even placement of table contents, e.g.
 
 xxx  | yyy'
      |
 -----|-----
      |
 xxx' | yyy
 
 Such breaks are rather ugly. They also make the table considerably
 longer. One can use the extra length of the table as a measure of
 skew placement and thus of ugliness and of the penalty value
 associated with this break. As a result, breaks with a skew
 placement of content are disfavoured, and only selected when more
 even breaks have lots of demerits themselves, due to other causes.
 
 Regards, Simon
 
 --
 Simon Pepping
 home page: http://www.leverkruid.nl



Jeremias Maerki



Re: Element list generation for tables (special case)

2005-08-01 Thread Andreas L Delmelle

Merely FYI: slight correction needed...


On 30.07.2005 15:14:04 Andreas L Delmelle wrote:


Currently, I don't think we already have a mapping of these
object-applicable_props anywhere, ...


We do have such a map in org.apache.fop.fo.PropertySets, but I don't 
get the impression that it is equipped to allow lookup of whether a 
given property is applicable for a given formatting object...



Cheers,

Andreas



Re: Element list generation for tables (special case)

2005-07-31 Thread Simon Pepping
On Sat, Jul 30, 2005 at 03:46:31PM +0200, Jeremias Maerki wrote:
 Sorry, but I have trouble understanding what you mean. Could you please
 elaborate with an example? Thanks.

 On 30.07.2005 13:54:25 Simon Pepping wrote:
  On Wed, Jul 27, 2005 at 10:40:25PM +0200, Jeremias Maerki wrote:
   I was under the impression that the breaker automatically favors break
   decisions that take up less space. It even goes so far that if you have
   a minimum=0pt and an optimum=2opt on a space-before, that it
   currently chooses 0pt which is not so good, actually.
 
  Penalties would help. If there were a penalty associated with the
  break below 'B', then the break above it becomes more favourable. I do
  not think the breaker could do that otherwise (without the newly
  proposed rule).

If there were a penalty value associated with a break that makes the
table longer, e.g. 0.1 * w, then the following list would result:

 8) box w=9600
 9) penalty p=0 w=0
10) box w=28800
11) penalty p=0 w=0
12) box w=0 //-- this is where the second row starts
13) penalty p=960 w=9600  //this penalty is due to the possible break after B
14) box w=28800
15) penalty p=0 w=0 //this is the next break poss after three lines
//due to the orphan setting
16) box w=28800

Now a break at 12 would have 960 demerits. A break at 10 would have 0
demerits, but because it would have less content on the page it would
have a larger stretch and that would itself have associated demerits, say
500. Then the break at 10 would be selected.
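
A minimal sketch of the arithmetic above, written in Python rather than FOP's Java. The Element class, the function name and the 500 stretch demerits are illustrative assumptions; only the element widths and the factor 0.1 come from the message.

```python
from dataclasses import dataclass

@dataclass
class Element:
    kind: str   # "box" or "penalty"
    w: int      # width in millipoints
    p: int = 0  # penalty value, only meaningful for penalties

PENALTY_FACTOR = 0.1  # the "0.1 * w" example factor; not a tuned value

def with_width_penalties(elements, factor=PENALTY_FACTOR):
    """Return a copy in which every penalty's p is factor * its width."""
    result = []
    for e in elements:
        if e.kind == "penalty":
            result.append(Element("penalty", e.w, round(factor * e.w)))
        else:
            result.append(Element("box", e.w))
    return result

# Elements 8..16 of the excerpt, with all penalties still at p=0.
excerpt = [
    Element("box", 9600), Element("penalty", 0),
    Element("box", 28800), Element("penalty", 0),
    Element("box", 0), Element("penalty", 9600),
    Element("box", 28800), Element("penalty", 0),
    Element("box", 28800),
]
weighted = with_width_penalties(excerpt)

# Compare the two candidate breaks using the figures from the message:
# 960 comes from the width-based penalty, 500 is the assumed stretch cost.
candidates = {"break at 12": weighted[5].p, "break at 10": 500}
best = min(candidates, key=candidates.get)
print(best)
```

Only the penalty with a non-zero width picks up a non-zero value, so the other break possibilities stay as cheap as before.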

In general, the table breaker may select breaks with a skew placement
of table contents, e.g.

xxx  |
     |
-----|-----
     |
xxx' | yyy

over breaks with a more even placement of table contents, e.g.

xxx  | yyy'
     |
-----|-----
     |
xxx' | yyy

Such breaks are rather ugly. They also make the table considerably
longer. One can use the extra length of the table as a measure of
skew placement and thus of ugliness and of the penalty value
associated with this break. As a result, breaks with a skew
placement of content are disfavoured, and only selected when more
even breaks have lots of demerits themselves, due to other causes.

Regards, Simon

--
Simon Pepping
home page: http://www.leverkruid.nl



Re: Element list generation for tables (special case)

2005-07-30 Thread Jeremias Maerki
D'oh, right. :-) Lucky me.

Too bad, we don't generate validation warnings for misplaced
non-inherited properties. Didn't we have that discussion already this
year? I can't find it or am I imagining it?

On 30.07.2005 03:47:45 Andreas L Delmelle wrote:
 On Jul 28, 2005, at 14:04, Jeremias Maerki wrote:
 
  On 28.07.2005 13:42:08 Andreas L Delmelle wrote:
 
  Where it comes to rowspans:
  In my modified example, if you move all the text in the middle column
  to the first row and make that cell span two rows, things get a bit
  awkward without the proposed rule anyway...
 
  Ouch. You definitely hit a bug here. The height calculation rule should
  have placed the two Bs right under the As (i.e. first row height =
  8pt).
 
 A bug in me, that's for sure! I misplaced the rowspan property :-)
 
 Sorry.
 
 Cheers,
 
 Andreas



Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-30 Thread Andreas L Delmelle

On Jul 29, 2005, at 23:25, Jeremias Maerki wrote:


Strike that. Just found a mean case where my quick hack breaks. Back to
frame one and a half. It's going to be a bit more difficult.


FWIW: It occurred to me that, with a break-before=page on the 
fo:block in the second column/second row, the result you initially 
posted would be correct... at least, I think so :-/

This made me wonder if the rule has to be formulated differently.

Let's make it: until we reach the first grid unit in the row that has a 
box that actually causes a break --either through a forced break or 
imposed by bpd constraints-- all break possibilities in previous grid 
units are ONLY possibilities.

Those possibilities need to be taken into consideration, if and only if:
1) the breaking grid unit has previous boxes that still fit on the page
2) or the break was forced
(or = inclusive)
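
The two conditions can be written as a one-line predicate. This is only a sketch in Python; the function and parameter names are made up for illustration and do not exist in FOP.

```python
def earlier_breaks_apply(previous_boxes_fit: bool, break_forced: bool) -> bool:
    """Inclusive or: either condition alone is enough."""
    return previous_boxes_fit or break_forced

# Print the full truth table for the two conditions.
for fit in (False, True):
    for forced in (False, True):
        print(fit, forced, earlier_breaks_apply(fit, forced))
```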

For the following grid units in the same row, we have enough 
information to decide if we need to break before their first box or 
not, so they do not necessarily have to have contributed their 'one 
box'.


So, IIC, the grid units in a row each have to contribute ALL of their 
boxes until the first real break (more than a possible break). In the 
presented case, this comes down to the same thing as saying that they 
have to contribute one box, but that was a simplified case for 
demonstration purposes.


If implementing the rule that way is possible, I think this would hold 
for most cases.


HTH!

Cheers,

Andreas



Re: Element list generation for tables (special case)

2005-07-30 Thread Simon Pepping
On Wed, Jul 27, 2005 at 10:40:25PM +0200, Jeremias Maerki wrote:
 But I get the impression that this avoids the topic I raised. :-) I
 think this here is not about whether these special break conditions are
 favored or avoided but if they should be allowed at all.

OK. Yes, the rule you propose sounds OK.

Inside a row group, you may limit the rule to those columns which
start a grid unit in this row, and exclude the columns which span into
this row from a previous row.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.nl



Re: Element list generation for tables (special case)

2005-07-30 Thread Simon Pepping
On Wed, Jul 27, 2005 at 10:40:25PM +0200, Jeremias Maerki wrote:
 I was under the impression that the breaker automatically favors break
 decisions that take up less space. It even goes so far that if you have
 a minimum=0pt and an optimum=2opt on a space-before, that it
 currently chooses 0pt which is not so good, actually.

Penalties would help. If there were a penalty associated with the
break below 'B', then the break above it becomes more favourable. I do
not think the breaker could do that otherwise (without the newly
proposed rule).
 
 Well, we have several documented examples on the Wiki which we could
 play through to see if the breaker is likely to make bad break decisions.
 
 But I get the impression that this avoids the topic I raised. :-) I
 think this here is not about whether these special break conditions are
 favored or avoided but if they should be allowed at all.
 
 On 27.07.2005 21:54:00 Simon Pepping wrote:
  One thing that IMHO is still lacking in the table breaking code is
  penalty values. ATM all penalties are 0. I believe the penalty value
  should depend on the extra vertical size that the break contributes,
  that is, on the penalty's width. I have no idea about the
  multiplication constant, nor if it should be linear or quadratic. I am
  not sure if it avoids the current case, but it is surely needed in
  order to favour better breaks over worse ones.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.nl



Re: Element list generation for tables (special case)

2005-07-30 Thread Andreas L Delmelle

On Jul 30, 2005, at 11:51, Jeremias Maerki wrote:


D'oh, right. :-) Lucky me.

Too bad, we don't generate validation warnings for misplaced
non-inherited properties. Didn't we have that discussion already this
year? I can't find it or am I imagining it?


I also remember this being mentioned... Yep, found it. A thread from 
about a month ago.


http://marc.theaimsgroup.com/?l=fop-dev&m=111962589510266&w=2

As Glen indicated, the XSL-FO Rec starts off by allowing any property 
on any object, but further on, it does state that for every class of 
objects there is a specific set of applicable properties.


Thinking of ideas on implementing such checks... Currently, I don't 
think we already have a mapping of these object-applicable_props 
anywhere, and maybe we don't even need such a map. Since the 
PropertyList is a temporary list anyway, whose individual properties 
get bound to member variables of the respective objects, is it safe to 
say that the FObj subclass' member variables --or at least a subset-- 
corresponds to the set of applicable properties?


If that is true, what we're looking for seems to be a possibility to 
check whether the list contains any unbound properties after the call 
to --or ending-- FObj.bind().


Shouldn't cost too much, I think.
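
A rough sketch of that check, in Python rather than Java. It assumes the set of property names an object binds is available somehow; the function name and the sample data are hypothetical and are not FOP's actual API.

```python
def unbound_properties(specified, bound):
    """Names of properties that were specified but never bound."""
    return sorted(set(specified) - set(bound))

# Hypothetical data: properties written on an fo:block in the source ...
specified_on_block = {"space-before": "6pt", "number-rows-spanned": "2"}
# ... and a made-up, incomplete set of names the block's bind step reads.
bound_by_block = {"space-before", "space-after", "text-align"}

for name in unbound_properties(specified_on_block, bound_by_block):
    print(f"warning: property '{name}' is not applicable here and was ignored")
```

The sketch ignores inheritance. An inherited property may legitimately sit on an object it does not apply to, so a real check would have to skip those and warn only about non-inherited ones.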

Cheers,

Andreas



Re: Element list generation for tables (special case)

2005-07-30 Thread Jeremias Maerki

On 30.07.2005 13:07:40 Andreas L Delmelle wrote:
 On Jul 29, 2005, at 23:25, Jeremias Maerki wrote:
 
  Strike that. Just found a mean case where my quick hack breaks. Back to
  frame one and a half. It's going to be a bit more difficult.
 
 FWIW: It occurred to me that, with a break-before=page on the 
 fo:block in the second column/second row, the result you initially 
 posted would be correct... at least, I think so :-/
 This made me wonder if the rule has to be formulated differently.
 
 Let's make it: until we reach the first grid unit in the row that has a 
 box that actually causes a break --either through a forced break or 
 imposed by bpd constraints-- all break possibilities in previous grid 
 units are ONLY possibilities.
 Those possibilities need to be taken into consideration, if and only if:
 1) the breaking grid unit has previous boxes that still fit on the page
 2) or the break was forced
 (or = inclusive)

Right, important point. I forgot about the hard breaks. My quick hack
would also have failed with those. But I already have another idea how
to fix this without too much effort. I'll try that on Tuesday when my
brain isn't preoccupied with the weekend and the national holiday on
Monday. :-)

 For the following grid units in the same row, we have enough 
 information to decide if we need to break before their first box or 
 not, so they do not necessarily have to have contributed their 'one 
 box'.
 
 So, IIC, the grid units in a row each have to contribute ALL of their 
 boxes until the first real break (more than a possible break). In the 
 presented case, this comes down to the same thing as saying that they 
 have to contribute one box, but that was a simplified case for 
 demonstration purposes.
 
 If implementing the rule that way is possible, I think this would hold 
 for most cases.

I agree.

 HTH!

Thanks. It does.


Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-30 Thread Jeremias Maerki
Thanks for looking it up. I've put it on the todo list on the Wiki so it
doesn't get forgotten. It's low priority anyway. It's probably a good
exercise for someone who wants to get into how the FO tree works.

On 30.07.2005 15:14:04 Andreas L Delmelle wrote:
 On Jul 30, 2005, at 11:51, Jeremias Maerki wrote:
 
  D'oh, right. :-) Lucky me.
 
  Too bad, we don't generate validation warnings for misplaced
  non-inherited properties. Didn't we have that discussion already this
  year? I can't find it or am I imagining it?
 
 I also remember this being mentioned... Yep, found it. A thread from 
 about a month ago.
 
 http://marc.theaimsgroup.com/?l=fop-dev&m=111962589510266&w=2
 
 As Glen indicated, the XSL-FO Rec starts off by allowing any property 
 on any object, but further on, it does state that for every class of 
 objects there is a specific set of applicable properties.
 
 Thinking of ideas on implementing such checks... Currently, I don't 
 think we already have a mapping of these object-applicable_props 
 anywhere, and maybe we don't even need such a map. Since the 
 PropertyList is a temporary list anyway, whose individual properties 
 get bound to member variables of the respective objects, is it safe to 
 say that the FObj subclass' member variables --or at least a subset-- 
 corresponds to the set of applicable properties?
 
 If that is true, what we're looking for seems to be a possibility to 
 check whether the list contains any unbound properties after the call 
 to --or ending-- FObj.bind().
 
 Shouldn't cost too much, I think.
 
 Cheers,
 
 Andreas



Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-29 Thread Jeremias Maerki

On 28.07.2005 15:12:50 Chris Bowditch wrote:
 I've been following this thread with interest. From a conceptual point 
 of view, I agree with Andreas. I can't see any situation where you might 
 want to have cells of the same row group on separate pages. Regardless 
 of how many rows a particular cell spans.
 
 Is there nothing in the spec to give you some clues?

I haven't found anything.

As I suspected, the rule I proposed was very easy to implement (13 new
lines). Too bad minotaur is down so I can't commit.

The only thing that my quick work-around can't handle is if someone does
something like this:

<fo:table-row>
  <fo:table-cell>
    <fo:block><fo:leader/></fo:block>
    <fo:block>A</fo:block>
  </fo:table-cell>
  <fo:table-cell>
    <fo:block><fo:leader/></fo:block>
    <fo:block>Some long text whose first three lines will be kept
    together by an orphan setting.</fo:block>
  </fo:table-cell>
</fo:table-row>

The text A may still be separated from the second block in the second
table-cell because both cells have already contributed a box and a
penalty to the element list of each cell. The algorithm doesn't check if
the contributed boxes have any meaningful content. Obviously, that's the
limit of my current proposal. But it's a start.


Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-28 Thread Jeremias Maerki

On 27.07.2005 23:26:48 Andreas L Delmelle wrote:
 On Jul 27, 2005, at 20:45, Jeremias Maerki wrote:
 
 Hi,
 
  I got a test case for tables which raises not a technical but rather an
  interesting conceptual question. Please have a look at the attached test
  case. It defines a table with two columns and two rows. In the given
  setup the second row creates a break decision with the current code that
  can be argued as being bad (see the PDF).
 
 Indeed, doesn't look right. Given the value for the orphans property,  
 one still would reasonably expect the break to occur before the first  
 cell of the second row.

...or after the first 3 lines of the second row.

 BTW: tried adding a third column mirroring the first, and this leads to  
 ONLY the second column being moved to the next page... This as a  
 further demonstration that the currently produced result still leaves a  
 bit to be desired. (see attach)

That was to be expected because the element lists from the first and
third columns will likely be the same and therefore won't produce a
different combined element list.

  Here's an excerpt from the element list:
 
   8) box w=9600
   9) penalty p=0 w=0
  10) box w=28800
  11) penalty p=0 w=0
  12) box w=0 //-- this is where the second row starts
  13) penalty p=0 w=9600  //this penalty is due to the possible break after B
  14) box w=28800
  15) penalty p=0 w=0 //this is the next break poss after three lines
  //due to the orphan setting
  16) box w=28800
 
  While working on element list generation for tables I came across this
  question and decided not to do anything about it, especially since
  removing some of these break possibilities might not be desirable in all
  cases.
 
  A rule that could be easily implemented would be that we allow the first
  break possibility only after every cell in a new row contributed at
  least one of its own boxes to the combined element list.
 
 So IOW, if I get this correctly: all break possibilities are to be  
 considered preliminary until the last cell occupying this row (= last  
 grid-unit in the row) has been taken into account?

Almost. In different words again: this means the first step is only
after each newly started cell in a new row contributes at least one box
to the combined element list. I wouldn't want to work with something
like a preliminary break possibility as it suggests that you somehow
have to revisit the list. I'd rather improve the getNextStep method to
only return for the first time after the above rule is met.

  An example: If you look at page 1 of [1], step 1 would be ignored. On
  page 3 of [1], the steps 1 and 2 would be ignored.
  [1] http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf
 
 Hmm... Do you mean that the steps would be performed but their results  
 discarded, or that the steps simply would not be performed at all?

Not performed at all. See above.

 I'd think the first, but just want to make sure...
 
 Are the break possibilities currently considered only at the level of  
 the table body --so the element list contains the elements for the  
 cells' boxes, but no separate elements/indicators of row-boundaries?

We seem to have a different word set for expressing this. I don't think
we can say that the breaks are considered at table body level. And you
have to be careful about which element list you mean: the individual
cell element lists or the effective combined element list. Let me
explain how this is implemented:

The TableRowIterator simply provides effective rows with grid units.
For TableContentLM it chooses an array of effective rows which forms a
row group so that no column-spanned cell is split between groups. See
the Wiki for details. Such a row group is the minimal work item for
combining element lists. There is always a break possibility before and
after a row group (except if there is a keep constraint on a row, for
example). Inside a row group the break possibilities are determined by
the getNextStep() method where the combined element list is created.
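
To illustrate the row-group idea, here is a small Python sketch. It is not the TableRowIterator code. It reads the condition as "a cell that spans several rows must stay within one group", and the input format (one list of row-span counts per row) is an assumption made up for the example.

```python
def row_groups(rows):
    """Group row indices so that no spanning cell crosses a group boundary.

    rows: for each row, the row-span counts of the cells that start in it.
    """
    groups, current, open_until = [], [], 0
    for index, spans in enumerate(rows):
        current.append(index)
        for span in spans:
            # Last row index still covered by a cell starting in this row.
            open_until = max(open_until, index + span - 1)
        if open_until <= index:
            # No cell reaches past this row, so the group can be closed.
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups

# Four rows; the first cell of row 1 spans two rows.
rows = [[1, 1], [2, 1], [1], [1, 1]]
print(row_groups(rows))
```

In this example rows 1 and 2 end up in the same group because of the spanning cell, while rows 0 and 3 each form a group of their own.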

 In that case --with the risk of underestimating the complexity of what  
 I propose--, perhaps an alternative to the suggested rule would be to  
 insert a step that combines the generated boxes/penalties only after  
 the element list for the last grid unit in a logical row has been  
 created (?) Anyway, instead of simply ignoring those steps, we could  
 also increase the penalty value for the offending break possibility  
 (currently: p=0 for all of them)
 So, IOW, for each row, store the element lists, and after all lists are  
 available, review the calculated penalty values, increasing them when a  
 given break possibility has undesirable consequences when the other  
 element lists for the row are taken into account.
 Or the other way around: give them a default penalty value that is high  
 enough, then afterwards decreasing them for the most 

Re: Element list generation for tables (special case)

2005-07-28 Thread Andreas L Delmelle

On Jul 28, 2005, at 10:10, Jeremias Maerki wrote:


On 27.07.2005 23:26:48 Andreas L Delmelle wrote:

Indeed, doesn't look right. Given the value for the orphans property,
one still would reasonably expect the break to occur before the first
cell of the second row.


...or after the first 3 lines of the second row.


Yes, but IIC, there isn't enough space left on the page to display 
those, hence 'break before the row'.


<snip />

These are all valid possibilities, but as I hinted I want to discuss
this at conceptual level not implementation level. I want to know if we
can have a general rule that we don't allow breaks before every cell
contributed at least one box to the combined element list. Also, Simon
and you are talking about providing higher penalty values, but I asked
about allowing a break at all (i.e. INFINITE penalty, or rather no
penalty at all, only a box). Considering a penalty value p<INFINITE
requires a decision that such breaks are possible/desirable in the first
place.


Well, if it is only the conceptual side that matters ATM, I don't see 
any immediate problem with such a rule.


<snip />

Right, but is that rule ok or not from a conceptual view. Are there any
cases where it might be bad?


Definitely OK for me. I can't seem to imagine a situation where this 
rule might cause undesirable effects. On the contrary, it reminds me of 
a question Luca raised some time ago when implementing lists, roughly : 
Is it desirable that, due to a page-break, label and body end up on 
different pages? I didn't think it was, and I don't think this should 
happen in case of tables either. The different columns in a row should 
always be considered together, not one-by-one. So the first column can 
never be allowed to end up in its entirety on a different page than the 
second.


Where it comes to rowspans:
In my modified example, if you move all the text in the middle column 
to the first row and make that cell span two rows, things get a bit 
awkward without the proposed rule anyway. (see attached PDF: the middle 
cell doesn't span the two rows, and the 'second' row only has two 
cells, and they're considered to be the first and second cell --while 
they should actually be the first and third)




table-body4c.xml.head.pdf
Description: Adobe PDF document





table-body4c.xml
Description: application/text




Greetz,

AD




Re: Element list generation for tables (special case)

2005-07-28 Thread Jeremias Maerki

On 28.07.2005 13:42:08 Andreas L Delmelle wrote:
 On Jul 28, 2005, at 10:10, Jeremias Maerki wrote:
 
  On 27.07.2005 23:26:48 Andreas L Delmelle wrote:
  Indeed, doesn't look right. Given the value for the orphans property,
  one still would reasonably expect the break to occur before the first
  cell of the second row.
 
  ...or after the first 3 lines of the second row.
 
 Yes, but IIC, there isn't enough space left on the page to display 
 those, hence 'break before the row'.

Right. I just wanted to point out all the relevant break possibilities.

 <snip />
  These are all valid possibilities, but as I hinted I want to discuss
  this at conceptual level not implementation level. I want to know if we
  can have a general rule that we don't allow breaks before every cell
  contributed at least one box to the combined element list. Also, Simon
  and you are talking about providing higher penalty values, but I asked
  about allowing a break at all (i.e. INFINITE penalty, or rather no
  penalty at all, only a box). Considering a penalty value p<INFINITE
  requires a decision that such breaks are possible/desirable in the first
  place.
 
 Well, if it is only the conceptual side that matters ATM, I don't see 
 any immediate problem with such a rule.

Ok.

 <snip />
  Right, but is that rule ok or not from a conceptual view. Are there any
  cases where it might be bad?
 
 Definitely OK for me. I can't seem to imagine a situation where this 
 rule might cause undesirable effects. On the contrary, it reminds me of 
 a question Luca raised some time ago when implementing lists, roughly : 
 Is it desirable that, due to a page-break, label and body end up on 
 different pages?

D'oh. I missed that. Definitely the same problem.

 I didn't think it was, and I don't think this should 
 happen in case of tables either. The different columns in a row should 
 always be considered together, not one-by-one. So the first column can 
 never be allowed to end up in its entirety on a different page than the 
 second.
 
 Where it comes to rowspans:
 In my modified example, if you move all the text in the middle column 
 to the first row and make that cell span two rows, things get a bit 
 awkward without the proposed rule anyway. (see attached PDF: the middle 
 cell doesn't span the two rows, and the 'second' row only has two 
 cells, and they're considered to be the first and second cell --while 
 they should actually be the first and third)

Ouch. You definitely hit a bug here. The height calculation rule should
have placed the two Bs right under the As (i.e. first row height =
8pt).


Jeremias Maerki



Re: Element list generation for tables (special case)

2005-07-28 Thread Chris Bowditch
I've been following this thread with interest. From a conceptual point 
of view, I agree with Andreas. I can't see any situation where you might 
want to have cells of the same row group on separate pages. Regardless 
of how many rows a particular cell spans.


Is there nothing in the spec to give you some clues?

<snip what="very good discussion"/>

Chris



Element list generation for tables (special case)

2005-07-27 Thread Jeremias Maerki
I got a test case for tables which raises not a technical but rather an
interesting conceptual question. Please have a look at the attached test
case. It defines a table with two columns and two rows. In the given
setup the second row creates a break decision with the current code that
can be argued as being bad (see the PDF). Here's an excerpt from the
element list:

 8) box w=9600
 9) penalty p=0 w=0
10) box w=28800
11) penalty p=0 w=0
12) box w=0 //-- this is where the second row starts
13) penalty p=0 w=9600  //this penalty is due to the possible break after B
14) box w=28800
15) penalty p=0 w=0 //this is the next break poss after three lines
//due to the orphan setting
16) box w=28800

While working on element list generation for tables I came across this
question and decided not to do anything about it, especially since
removing some of these break possibilities might not be desirable in all
cases.

A rule that could be easily implemented would be that we allow the first
break possibility only after every cell in a new row contributed at
least one of its own boxes to the combined element list.

An example: If you look at page 1 of [1], step 1 would be ignored. On
page 3 of [1], the steps 1 and 2 would be ignored.

[1] http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf

With this rule the element list would look like this:

 8) box w=9600
 9) penalty p=0 w=0
10) box w=28800
11) penalty p=0 w=0
12) box w=28800 //-- this is where the second row starts
13) penalty p=0 w=0
14) box w=28800
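
The effect of the rule can be sketched with a deliberately simplified model in Python. This is not FOP's getNextStep(). Each cell is reduced to the cumulative heights at which it may break, an assumption made for the example, and the numbers are chosen to match the excerpt (B is 9600 high, the second cell breaks after 28800 and 57600).

```python
def combined_steps(cell_breaks, require_all_cells=True):
    """Merge per-cell break heights into the steps of the combined list.

    cell_breaks: per cell, ascending cumulative heights at which that cell
    may break; the first entry is the height of the cell's first box.
    """
    steps = sorted({h for cell in cell_breaks for h in cell})
    if require_all_cells:
        # The first step is only allowed once every cell has contributed
        # at least one box, i.e. at the largest "first box" height.
        first_allowed = max(cell[0] for cell in cell_breaks)
        steps = [s for s in steps if s >= first_allowed]
    return steps

cells = [[9600], [28800, 57600]]
print(combined_steps(cells, require_all_cells=False))  # includes the step after B
print(combined_steps(cells))                           # step after B is dropped
```

Without the rule there is a step at 9600, which corresponds to the break after B. With the rule the first step is at 28800, matching the "box w=28800" at element 12 above.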

I'm unsure ATM what this would mean for cases with row spanning, though.

I can see that this new rule would make this better in most cases. What
worries me is that there might be cases where we wouldn't want that
behaviour, although ATM I can't see them. So I just want to check with
you that I haven't forgotten about anything. Or maybe someone has a
better rule to implement this. Thoughts welcome.


Jeremias Maerki


table-body4.xml
Description: Binary data


table-body4.xml.head.pdf
Description: Binary data


Re: Element list generation for tables (special case)

2005-07-27 Thread Simon Pepping
One thing that IMHO is still lacking in the table breaking code is
penalty values. ATM all penalties are 0. I believe the penalty value
should depend on the extra vertical size that the break contributes,
that is, on the penalty's width. I have no idea about the
multiplication constant, nor if it should be linear or quadratic. I am
not sure if it avoids the current case, but it is surely needed in
order to favour better breaks over worse ones.
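
To make the linear-versus-quadratic question concrete, here is a small Python sketch. Both constants are arbitrary, and so is the cap: it assumes that 1000 stands for an infinite penalty, so finite values are kept below it.

```python
INFINITE = 1000  # assumed value meaning "no break allowed here"

def linear_penalty(w, k=0.1):
    """Penalty proportional to the penalty's width w (in millipoints)."""
    return min(INFINITE - 1, round(k * w))

def quadratic_penalty(w, k=1e-5):
    """Penalty growing with the square of the width."""
    return min(INFINITE - 1, round(k * w * w))

# Compare both formulas for a few extra heights.
for w in (0, 4800, 9600, 28800):
    print(w, linear_penalty(w), quadratic_penalty(w))
```

With these constants the two formulas roughly agree at w=9600, but the quadratic one is more lenient for small extra heights. Both hit the cap for large widths, which suggests that the constant matters at least as much as the shape of the curve.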

Simon

On Wed, Jul 27, 2005 at 08:45:48PM +0200, Jeremias Maerki wrote:
 I got a test case for tables which raises not a technical but rather an
 interesting conceptual question. Please have a look at the attached test
 case. It defines a table with two columns and two rows. In the given
 setup the second row creates a break decision with the current code that
 can be argued as being bad (see the PDF). Here's an excerpt from the
 element list:
 
  8) box w=9600
  9) penalty p=0 w=0
 10) box w=28800
 11) penalty p=0 w=0
 12) box w=0 //-- this is where the second row starts
 13) penalty p=0 w=9600  //this penalty is due to the possible break after B
 14) box w=28800
 15) penalty p=0 w=0 //this is the next break poss after three lines
 //due to the orphan setting
 16) box w=28800
 
 While working on element list generation for tables I came across this
 question and decided not to do anything about it, especially since
 removing some of these break possibilities might not be desirable in all
 cases.
 
 A rule that could be easily implemented would be that we allow the first
 break possibility only after every cell in a new row contributed at
 least one of its own boxes to the combined element list.
 
 An example: If you look at page 1 of [1], step 1 would be ignored. On
 page 3 of [1], the steps 1 and 2 would be ignored.
 
 [1] http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf
 
 With this rule the element list would look like this:
 
  8) box w=9600
  9) penalty p=0 w=0
 10) box w=28800
 11) penalty p=0 w=0
 12) box w=28800 //-- this is where the second row starts
 13) penalty p=0 w=0
 14) box w=28800
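
The rule can be illustrated with a simplified Python model. It assumes each cell of the row is given as a list of box heights with at least one box, and it ignores borders, padding, row spanning and the penalty widths that the real element list generation has to handle. The function names are made up for this sketch.

```python
from itertools import accumulate
from typing import List, Tuple


def allowed_steps(cell_box_heights: List[List[int]]) -> List[int]:
    """Return the cumulative row heights at which a break is allowed."""
    # Cumulative heights at which each individual cell could be broken.
    per_cell = [list(accumulate(heights)) for heights in cell_box_heights]
    # Earliest height at which every cell has contributed its first box.
    first_allowed = max(steps[0] for steps in per_cell)
    candidates = {step for steps in per_cell for step in steps}
    return sorted(step for step in candidates if step >= first_allowed)


def to_element_list(steps: List[int]) -> List[Tuple]:
    """Turn step heights into boxes separated by zero penalties."""
    elements: List[Tuple] = []
    previous = 0
    for index, step in enumerate(steps):
        if index > 0:
            elements.append(("penalty", 0, 0))  # (kind, p, w)
        elements.append(("box", step - previous))  # (kind, w)
        previous = step
    return elements


# Second row of the test case: one short cell and one cell with two boxes.
row = [[9600], [28800, 28800]]
for element in to_element_list(allowed_steps(row)):
    print(element)
```

For this row the step at 9600 is dropped, which leaves two boxes of 28800 separated by a single penalty, as in the list above.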
 
 I'm unsure ATM what this would mean for cases with row spanning, though.
 
 I can see that this new rule would make this better in most cases. What
 worries me is that there might be cases where we wouldn't want that
 behaviour, although ATM I can't see them. So I just want to check with
 you that I haven't forgotten about anything. Or maybe someone has a
 better rule to implement this. Thoughts welcome.
 
 
 Jeremias Maerki




-- 
Simon Pepping
home page: http://www.leverkruid.nl



Re: Element list generation for tables (special case)

2005-07-27 Thread Andreas L Delmelle

On Jul 27, 2005, at 20:45, Jeremias Maerki wrote:

Hi,


I got a test case for tables which raises not a technical but rather an
interesting conceptual question. Please have a look at the attached test
case. It defines a table with two columns and two rows. In the given
setup the second row creates a break decision with the current code that
can be argued as being bad (see the PDF).


Indeed, doesn't look right. Given the value for the orphans property,  
one still would reasonably expect the break to occur before the first  
cell of the second row.


BTW: I tried adding a third column mirroring the first, and this leads to
ONLY the second column being moved to the next page... This is a
further demonstration that the currently produced result still leaves a
bit to be desired. (see attachment)



Here's an excerpt from the element list:

 8) box w=9600
 9) penalty p=0 w=0
10) box w=28800
11) penalty p=0 w=0
12) box w=0 //-- this is where the second row starts
13) penalty p=0 w=9600  //this penalty is due to the possible break after B
14) box w=28800
15) penalty p=0 w=0 //this is the next break poss after three lines
//due to the orphan setting
16) box w=28800

While working on element list generation for tables I came across this
question and decided not to do anything about it, especially since
removing some of these break possibilities might not be desirable in all
cases.

A rule that could be easily implemented would be that we allow the first
break possibility only after every cell in a new row contributed at
least one of its own boxes to the combined element list.


So IOW, if I get this correctly: all break possibilities are to be  
considered preliminary until the last cell occupying this row (= last  
grid-unit in the row) has been taken into account?



An example: If you look at page 1 of [1], step 1 would be ignored. On
page 3 of [1], the steps 1 and 2 would be ignored.

[1] http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf


Hmm... Do you mean that the steps would be performed but their results  
discarded, or that the steps simply would not be performed at all?

I'd think the first, but just want to make sure...

Are the break possibilities currently considered only at the level of  
the table body --so the element list contains the elements for the  
cells' boxes, but no separate elements/indicators of row-boundaries?


In that case --with the risk of underestimating the complexity of what  
I propose--, perhaps an alternative to the suggested rule would be to  
insert a step that combines the generated boxes/penalties only after  
the element list for the last grid unit in a logical row has been  
created (?) Anyway, instead of simply ignoring those steps, we could  
also increase the penalty value for the offending break possibility  
(currently: p=0 for all of them)
So, IOW, for each row, store the element lists, and after all lists are  
available, review the calculated penalty values, increasing them when a  
given break possibility has undesirable consequences when the other  
element lists for the row are taken into account.
Or the other way around: give them a default penalty value that is high  
enough, then afterwards decreasing them for the most favorable break  
possibilities.

Or modify all boxes' widths (=heights) to be equal to the largest box.
After this step is completed, add the combined element list to the body.

IIC, the two separate element lists for the second row would be:

First grid unit:
1) box w=9600
2) penalty p=0 w=0

Second grid unit:
1) box w=28800
2) penalty p=0 w=0

So, compare the first boxes' widths and, since the first box in the  
first list is smaller than that in the second list, either increase the  
penalty value for the second step in the first list, or change the  
width of the first box in the first list. Maybe the latter is more  
attractive, since the resulting combined list can then be created by  
concatenating the two separate lists...


[Admitted: this particular case is rather simple, since both lists only  
have one box.]
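
The two options just described can be sketched in Python as follows. The `Box` and `Penalty` classes, both function names and the penalty value of 900 are hypothetical, and the sketch assumes every per-cell list starts with a box; it is not how the FOP code represents element lists.

```python
from dataclasses import dataclass, replace
from typing import List, Union


@dataclass(frozen=True)
class Box:
    w: int


@dataclass(frozen=True)
class Penalty:
    p: int
    w: int = 0


Element = Union[Box, Penalty]


def equalize_first_boxes(cell_lists: List[List[Element]]) -> List[List[Element]]:
    """Option 1: stretch every list's first box to the tallest first box."""
    tallest = max(elements[0].w for elements in cell_lists)
    return [[Box(tallest)] + list(elements[1:]) for elements in cell_lists]


def penalize_short_first_boxes(
    cell_lists: List[List[Element]], p: int = 900
) -> List[List[Element]]:
    """Option 2: raise the penalty that follows a shorter first box."""
    tallest = max(elements[0].w for elements in cell_lists)
    result = []
    for elements in cell_lists:
        updated = list(elements)
        short = elements[0].w < tallest
        if short and len(updated) > 1 and isinstance(updated[1], Penalty):
            updated[1] = replace(updated[1], p=p)
        result.append(updated)
    return result


first = [Box(9600), Penalty(0)]
second = [Box(28800), Penalty(0)]
print(equalize_first_boxes([first, second]))
print(penalize_short_first_boxes([first, second]))
```

Both functions return new lists and leave the inputs untouched, so the two variants can be compared on the same row.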


Then combine the lists to arrive at the result below:


With this rule the element list would look like this:

<snip />


12) box w=28800 //-- this is where the second row starts
13) penalty p=0 w=0
14) box w=28800

I'm unsure ATM what this would mean for cases with row spanning, though.


As long as the criterion is that every _grid unit_ for the (logical)  
row in question must have contributed at least one box, I wouldn't  
expect any particular problem.



I can see that this new rule would make this better in most cases. What
worries me is that there might be cases where we wouldn't want that
behaviour, although ATM I can't see them. So I just want to check with
you that I haven't forgotten about anything. Or maybe someone has a
better rule to implement this. Thoughts welcome.



Greetz,

AD



Re: Element list generation for tables (special case)

2005-07-27 Thread Andreas L Delmelle


Sorry, forgot the attachment...



table-body4b.xml.head.pdf
Description: Adobe PDF document


On Jul 27, 2005, at 23:26, Andreas L Delmelle wrote:


On Jul 27, 2005, at 20:45, Jeremias Maerki wrote:

Hi,


I got a test case for tables which raises not a technical but rather an
interesting conceptual question. Please have a look at the attached test
case. It defines a table with two columns and two rows. In the given
setup the second row creates a break decision with the current code that
can be argued as being bad (see the PDF).


Indeed, doesn't look right. Given the value for the orphans property,  
one still would reasonably expect the break to occur before the first  
cell of the second row.


BTW: I tried adding a third column mirroring the first, and this leads to
ONLY the second column being moved to the next page... This is a
further demonstration that the currently produced result still leaves a
bit to be desired. (see attachment)



Here's an excerpt from the element list:

 8) box w=9600
 9) penalty p=0 w=0
10) box w=28800
11) penalty p=0 w=0
12) box w=0 //-- this is where the second row starts
13) penalty p=0 w=9600  //this penalty is due to the possible break after B
14) box w=28800
15) penalty p=0 w=0 //this is the next break poss after three lines
//due to the orphan setting
16) box w=28800

While working on element list generation for tables I came across this
question and decided not to do anything about it, especially since
removing some of these break possibilities might not be desirable in all
cases.

A rule that could be easily implemented would be that we allow the first
break possibility only after every cell in a new row contributed at
least one of its own boxes to the combined element list.


So IOW, if I get this correctly: all break possibilities are to be  
considered preliminary until the last cell occupying this row (= last  
grid-unit in the row) has been taken into account?


An example: If you look at page 1 of [1], step 1 would be ignored. On
page 3 of [1], the steps 1 and 2 would be ignored.

[1] http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders.pdf


Hmm... Do you mean that the steps would be performed but their results  
discarded, or that the steps simply would not be performed at all?

I'd think the first, but just want to make sure...

Are the break possibilities currently considered only at the level of  
the table body --so the element list contains the elements for the  
cells' boxes, but no separate elements/indicators of row-boundaries?


In that case --with the risk of underestimating the complexity of what  
I propose--, perhaps an alternative to the suggested rule would be to  
insert a step that combines the generated boxes/penalties only after  
the element list for the last grid unit in a logical row has been  
created (?) Anyway, instead of simply ignoring those steps, we could  
also increase the penalty value for the offending break possibility  
(currently: p=0 for all of them)
So, IOW, for each row, store the element lists, and after all lists  
are available, review the calculated penalty values, increasing them  
when a given break possibility has undesirable consequences when the  
other element lists for the row are taken into account.
Or the other way around: give them a default penalty value that is  
high enough, then afterwards decreasing them for the most favorable  
break possibilities.

Or modify all boxes' widths (=heights) to be equal to the largest box.
After this step is completed, add the combined element list to the  
body.


IIC, the two separate element lists for the second row would be:

First grid unit:
1) box w=9600
2) penalty p=0 w=0

Second grid unit:
1) box w=28800
2) penalty p=0 w=0

So, compare the first boxes' widths and, since the first box in the  
first list is smaller than that in the second list, either increase  
the penalty value for the second step in the first list, or change the  
width of the first box in the first list. Maybe the latter is more  
attractive, since the resulting combined list can then be created by  
concatenating the two separate lists...


[Admitted: this particular case is rather simple, since both lists  
only have one box.]


Then combine the lists to arrive at the result below:


With this rule the element list would look like this:

<snip />


12) box w=28800 //-- this is where the second row starts
13) penalty p=0 w=0
14) box w=28800

I'm unsure ATM what this would mean for cases with row spanning, though.


As long as the criterion is that every _grid unit_ for the (logical)  
row in question must have contributed at least one box, I wouldn't  
expect any particular problem.


I can see that this new rule would make this better in most cases. What
worries me is that there might be cases where we wouldn't want that
behaviour, although ATM I can't see them. So I just want to check with
you