On Thu, 3 Nov 2005 05:56 pm, Manuel Mall wrote:
> On Wed, 2 Nov 2005 11:58 pm, Luca Furini wrote:
> > Manuel Mall wrote:
<snip/>
> >
> > If we have two (or more) spaces, we could use the sequence:
> >
> > 1  glue w=endB&P
> > 2  penalty w=0
> > 3  glue w=(- endB&P - startB&P)
> > 4  glue w=spaceIPD1
> > 5  glue w=spaceIPD2
> > 6  box w=0
> > 7  infinite penalty
> > 8  glue w=startB&P
> >
> > total width = spaceIPD1 + spaceIPD2
> > if break at #2 = endB&P / startB&P
> >
> > Glues #4 and #5 have a Position pointing to different AreaInfo
> > objects (from different LMs). This should solve (?) the case of
> > ignore-if-surrounding.
>
> Excellent, because ignore-if-surrounding is the only case we have to
> consider. For formatter-generated line breaks this is the same as
> ignore-if-after... and ignore-if-before...: because we control the
> position of the line break, we can logically position it such that
> the spaces can be removed in both the before and after cases.
> Therefore IMO we don't need any other Knuth sequences.
>
> However, as these are "integrated sequences" we still have to carry
> info about this between LMs. This is "for further study" and
> suggestions are welcome.
>
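For reference, the width bookkeeping of the sequence quoted above can be
checked with a small sketch. This is plain Python, not FOP code: the
Element class, the INFINITE constant, the zero width given to the
infinite penalty and the sample widths are all assumptions made for
illustration only.

```python
from dataclasses import dataclass

INFINITE = 1000  # assumed stand-in for an "infinite" penalty value


@dataclass(frozen=True)
class Element:
    kind: str         # "box", "glue" or "penalty"
    width: int        # natural width, arbitrary units
    penalty: int = 0  # only meaningful for kind == "penalty"


def space_sequence(end_bp, start_bp, space1, space2):
    """Build the eight elements #1..#8 of the quoted sequence."""
    return [
        Element("glue", end_bp),                # 1
        Element("penalty", 0, 0),               # 2
        Element("glue", -(end_bp + start_bp)),  # 3
        Element("glue", space1),                # 4
        Element("glue", space2),                # 5
        Element("box", 0),                      # 6
        Element("penalty", 0, INFINITE),        # 7 (width 0 is assumed)
        Element("glue", start_bp),              # 8
    ]


def unbroken_width(seq):
    """Width when no break is taken: penalties contribute nothing."""
    return sum(e.width for e in seq if e.kind != "penalty")


def widths_at_break(seq, index):
    """Width left on the ending line and on the starting line when
    breaking at the penalty seq[index]."""
    assert seq[index].kind == "penalty"
    before = sum(e.width for e in seq[:index] if e.kind != "penalty")
    before += seq[index].width  # a penalty's width counts only at a break
    rest = seq[index + 1:]
    # Glue and penalties right after a break are discarded up to the next box.
    first_box = next((i for i, e in enumerate(rest) if e.kind == "box"), len(rest))
    after = sum(e.width for e in rest[first_box:] if e.kind != "penalty")
    return before, after
```

With, say, endB&P = 3000, startB&P = 2000 and spaces of 4000 and 5000,
unbroken_width() gives spaceIPD1 + spaceIPD2 and a break at #2 (list
index 1) leaves endB&P on the ending line and startB&P on the next one,
which is what the two summary lines of the quoted sequence state.
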
Luca, as you are the expert on the Knuth sequences with respect to
break/space handling, I think it would be good if we could document all
the different cases we have so far, plus those we envisage in the near
future. Here are some of the combinations I have identified:

1. Non-breaking / non-elastic space => probably just a normal character,
i.e. part of a word.

2. Non-breaking / elastic space - e.g. U+00A0 Non-breaking space
        => Must prevent break
        => Must handle text-align

3. Breaking / non-elastic - e.g. U+200B ZWSP, or any other break between
two characters that does not involve adding or removing space/characters
        => Must handle border/padding
        => Must handle text-align

4. Breaking / non-elastic / remove if not at break - e.g. U+00AD Soft hyphen
        => Must remove if not at break
        => Must handle border/padding
        => Must handle text-align

5. Breaking / non-elastic / add character if at break - e.g. hyphenation
        => Must add space for hyphen if at break
        => Must handle border/padding
        => Must handle text-align

6. Breaking / elastic / non-removable - e.g. U+3000 Ideographic space
        => Must handle border/padding
        => Must handle text-align
        Question: XSL-FO does not define U+3000 as removable white space,
but would it be removed at a line break under common CJK typesetting
conventions?

7. Breaking / elastic / removable - e.g. U+0020 Space
        => Can occur in runs which must be wholly removed
        => Must handle border/padding
        => Must handle text-align

Have I missed any combinations? For example, is there a "breaking /
non-elastic / remove at break" case?
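
The list above can also be written down as a small table, which makes it
easier to see which properties each case combines. A minimal sketch in
Python: the field names and the two helper functions are labels chosen
here for illustration, not FOP classes, and the helper rules only
restate the "=>" lines of the list.

```python
from typing import NamedTuple


class SpaceCase(NamedTuple):
    number: int
    breakable: bool
    elastic: bool
    special: str   # extra behaviour named in the list, "" if none
    example: str


CASES = [
    SpaceCase(1, False, False, "", "normal character, part of a word"),
    SpaceCase(2, False, True, "", "U+00A0"),
    SpaceCase(3, True, False, "", "U+200B"),
    SpaceCase(4, True, False, "remove if not at break", "U+00AD"),
    SpaceCase(5, True, False, "add character if at break", "hyphenation"),
    SpaceCase(6, True, True, "non-removable", "U+3000"),
    SpaceCase(7, True, True, "removable", "U+0020"),
]


def needs_border_padding(case):
    # In the list, every case that allows a break must handle border/padding.
    return case.breakable


def needs_text_align(case):
    # In the list, text-align is named for every case except case 1.
    return case.breakable or case.elastic
```

Filtering CASES with needs_border_padding() yields cases 3 to 7, and
with needs_text_align() cases 2 to 7, matching the list.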

<snip/>

Regards

Manuel
