On Feb 6, 2006, at 17:04, Luca Furini wrote:

Hi Manuel / Luca,

Manuel Mall wrote:

IMO yes, there can be a break, and no, only the space needs to be removed. Again, the argument is that nbsp is not whitespace as per the XSL-FO definition and need not be removed.

What makes you think that both the nbsp and the space need to be removed around a fop generated linebreak?

Oops, I forgot to add an important condition: if the user explicitly states that the nbsp must be discarded around a line break:
  <fo:inline suppress-at-line-break="suppress">&nbsp;</fo:inline>

Oops, typo? suppress-at-line-break is a non-inherited property, only applicable to fo:character :-)

Well, the more I look at this, the more it seems unlikely to ever happen ... we are probably having a highly theoretical disquisition! :-)

<fo:character character="&#xA0;" suppress-at-line-break="suppress" />

followed by a space is indeed very theoretical.

So is (another alternative):

<fo:inline suppress-at-line-break="suppress">
  <fo:character character="&#xA0;"
                suppress-at-line-break="inherit" /> </fo:inline>

OTOH, if we can make the algorithm work in these exotic cases, then the commonly used scenarios will be a cake-walk. :-)

This does, in any case, shed some different light on the notion of 'pretty printing whitespace', since currently --at least that was my understanding of the discussions, and that's what I worked towards-- a fo:character is considered the same as a regular character, in that fo:characters representing XML whitespace are subject to whitespace-removal... Yet, one can arguably defend the idea that any *fo:*character is inserted for *XML* pretty printing purposes, no? Should this change be reverted then?
[Maybe partly, because suppose:

<fo:block>
  <fo:character character="&#x20;" suppress-at-line-break="retain" />
...

Currently, the fact that it is a fo:character is not known when running this through the algorithm. The CharIterators deal with the characters. The XMLWhiteSpaceHandler makes a decision based purely on the value of the character property. It is agnostic to the suppress-at-line-break property's value... I myself would tend to use a non-breaking space in this case, since it escapes the whitespace handling, but it is a theoretical possibility. :-)

Another alternative would be to introduce a member to the CharIterators...
Something like isSuppressible(), which would return true if:
( the current element is a regular character
  and it has codepoint U+0020 )
or ( the current element is a fo:character
  and
  (( the value of its character property is codepoint U+0020
    and suppress-at-line-break="auto" )
  or ( suppress-at-line-break="suppress" )))

As such, refinement (white-space)-character-removal could operate on this basis, and already resolve such issues at that stage.
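
To make the condition above concrete, here is a minimal Java sketch. It is only an illustration, not existing FOP code: the class name, the enum and the method signature are all hypothetical, and it assumes the suppress-at-line-break value has already been resolved (so "inherit" never reaches it).

```java
// Hypothetical sketch of the proposed isSuppressible() check; none of these
// names exist in FOP, they only mirror the condition described in the text.
final class SuppressibleSketch {

    /** Assumed stand-in for the resolved suppress-at-line-break value. */
    enum SuppressAtLineBreak { AUTO, SUPPRESS, RETAIN }

    private static final char SPACE = ' ';   // codepoint U+0020

    /**
     * @param ch              the character (for a fo:character, the value of
     *                        its character property)
     * @param fromFoCharacter true if the character comes from a fo:character
     * @param salb            resolved suppress-at-line-break value; ignored
     *                        for regular characters
     */
    static boolean isSuppressible(char ch, boolean fromFoCharacter,
                                  SuppressAtLineBreak salb) {
        if (!fromFoCharacter) {
            // Regular character: only a plain space is suppressible.
            return ch == SPACE;
        }
        // fo:character: a space with "auto", or anything with "suppress".
        return (ch == SPACE && salb == SuppressAtLineBreak.AUTO)
                || salb == SuppressAtLineBreak.SUPPRESS;
    }
}
```

With this, a fo:character holding a space with suppress-at-line-break="retain" would not be reported as suppressible, while a fo:character holding a nbsp with suppress-at-line-break="suppress" would.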

The current approach is still not 100% correct anyway...]

Anyway, I was still not sure whether there could be a break so I looked back at the Unicode Annex #14.
<snip />
So, it seems there could be a break between SPACE and NBSP (with NBSP starting the next line), but not between NBSP and SPACE. Can we say this is settled?

Yes! Definitely. We're looking for UAX#14 'compliance' as well here.
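
For reference, the SPACE / NBSP conclusion can be written down as a tiny Java sketch. It is deliberately limited to these two characters and is in no way a UAX#14 implementation; the class and method names are hypothetical, and rule numbers are left out.

```java
// Hypothetical sketch: break opportunities between SPACE and NBSP only,
// following the conclusion reached in this thread.
final class SpaceNbspBreakSketch {

    static final char SPACE = ' ';        // U+0020, line breaking class SP
    static final char NBSP = '\u00A0';    // U+00A0, line breaking class GL

    /** True if a line break is allowed between 'before' and 'after'. */
    static boolean breakAllowedBetween(char before, char after) {
        if (!isModeled(before) || !isModeled(after)) {
            throw new IllegalArgumentException(
                    "only SPACE and NBSP are modeled in this sketch");
        }
        if (after == SPACE) {
            return false;   // never break before a space
        }
        if (before == NBSP) {
            return false;   // never break after a non-breaking space
        }
        return true;        // remaining case: SPACE followed by NBSP
    }

    private static boolean isModeled(char c) {
        return c == SPACE || c == NBSP;
    }
}
```

So breakAllowedBetween(SPACE, NBSP) is true (the NBSP starts the next line), while breakAllowedBetween(NBSP, SPACE) is false.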

My 2 cents.

Cheers,

Andreas
