On Wed, Nov 09, 2005 at 09:09:55AM +0800, Manuel Mall wrote:
> On Wed, 9 Nov 2005 12:47 am, Andreas L Delmelle wrote:
> > >> 6. Breaking / elastic / non removable - eg. U+3000 Ideographic
> > >> space => Must handle border/padding
> > >> => Must handle text-align
> > >> Question: XSL-FO does not define U+3000 as removable white space
> > >> but would under common CJK typesetting conventions this be removed
> > >> at a line break?
> >
> > I think so. That's precisely what the definition for the "auto" value
> > of suppress-at-line-break warns about. Does this mean that the use of
> > a fo:character is mandated if the user wants it removed? Yes, IMO.
> >
> > Unless the editors can be persuaded to make U+3000 an exception to
> > the default "retain", like common spaces (U+0020), compliance means
> > treating this character maybe a bit counter-intuitively.
> >
> > >> 7. Breaking / elastic / removable - eg. U+0020 Space
> > >> => Can occur in runs which must be wholly removed
> > >> => Must handle border/padding
> > >> => Must handle text-align
> > >> Any combinations I have missed, e.g. is there a "break / non
> > >> elastic / remove at break" case?
> > >
> > > Maybe the fixed width spaces?
> >
> > More generally: any fixed-width character, added through a
> > fo:character, implying a feasible/favorable break before or after,
> > and having suppress-at-line-break="suppress".
> >
> > I could put:
> >
> > <fo:character character="a" suppress-at-line-break="suppress" />
> >
> > in a document, surrounded by non-collapsible whitespace, and the
> > formatter may decide to break before/after and drop the 'a'.
> >
> > Fixed-width spaces could be viewed as a subset. If they aren't added
> > via a fo:character, they would belong to category 'break - non-
> > elastic - non-removable'. (speaking strictly XSL-FO)
> >
> Andreas, I tend to disagree with the basic sentiment express here. If we
> accept Simon's notion that white space handling in XSL-FO is about
> dealing with spaces and linefeeds introduced by editors or humans for
> XML readability purposes then dealing with typographic conventions of
> particular scripts has nothing to do with the rules of white space
> handling. XSL-FO in quite a few places mentions user agent flexibility
> when it comes to dealing with script / language / country specific
> items. If we can, as Joerg suggests, replace a base letter followed by
> a combining diacritical mark with a matching combined glyph, why can't
> we replace an ideographic space followed by a line break with simply
> the line break? The point being I am not suggesting to remove the
> ideographic space under the XSL-FO white space rules but under 'script
> specific typographic conventions'. And I believe there is nothing in
> the spec which prohibits this - quite the opposite actually - the spec
> IMO encourages 'intelligent' handling of 'local customs'. Of course I
> don't know what the CJK typographic conventions are so this is all a
> bit hypothetical.

White space handling, in the sense of dealing with spaces and linefeeds,
is entirely covered by the properties linefeed-treatment and
white-space-collapse. The rewritten property white-space-treatment
(rewritten to the extent that its name and those of its values are no
longer a good indication of their meaning) covers the handling of a set
of characters around line breaks.

I agree with Andreas that one would have to write

<fo:character character="&#x3000;" suppress-at-line-break="suppress" />

to get it suppressed. Quite awkward. Whether it is a good idea to get
the same result through 'intelligent handling of local customs', I do
not know.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.nl
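
For illustration, here is a minimal XSL-FO sketch of the two cases discussed in this thread: an ideographic space written as plain text, and the same character wrapped in fo:character with suppress-at-line-break="suppress". The page geometry, the master name "narrow" and the sample text are made up for the example. Whether a break actually falls next to the space depends on the formatter and the available width.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Hypothetical narrow page, chosen only to make a line break likely -->
    <fo:simple-page-master master-name="narrow"
                           page-width="60mm" page-height="40mm">
      <fo:region-body margin="5mm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="narrow">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>
        <!-- Plain U+3000 in text: under the default "auto" it is treated
             as "retain", so it stays even at a line break -->
        first phrase&#x3000;second phrase
      </fo:block>
      <fo:block>
        <!-- Explicit fo:character: the formatter is asked to drop the
             character if it ends up at the start or end of a line -->
        first phrase<fo:character character="&#x3000;"
            suppress-at-line-break="suppress"/>second phrase
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

The second block shows why the approach is awkward: every ideographic space that should disappear at a break has to be marked up individually.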
