On Nov 9, 2005, at 02:09, Manuel Mall wrote:
Andreas, I tend to disagree with the basic sentiment expressed here.
If we accept Simon's notion that white space handling in XSL-FO is
about dealing with spaces and linefeeds introduced by editors or humans
for XML readability purposes, then dealing with typographic conventions
of particular scripts has nothing to do with the rules of white space
handling.
We're (again) more in agreement than we realize, I think... Although
now you've got me wondering what you think my 'basic sentiment' is :-)
Indeed, dealing with typographic conventions has nothing to do with
white-space handling. Hence my earlier remark that, strictly speaking
--apart from any flexibility/liberty wrt localization issues-- the
ideographic space should not be suppressed, and that at least in
theory, this would make a fo:character required (to override the
default "retain"). Maybe removing them at the refinement stage would
be a bit too early (?) Come to think of it: is the standard linefeed
character also commonly used in CJK scripts, or do they use a
different kind of line-terminator?
XSL-FO in quite a few places mentions user agent flexibility
when it comes to dealing with script / language / country specific
items. If we can, as Joerg suggests, replace a base letter followed by
a combining diacritical mark with a matching combined glyph, why can't
we replace an ideographic space followed by a line break with simply
the line break?
The point being I am not suggesting to remove the
ideographic space under the XSL-FO white space rules but under 'script
specific typographic conventions'.
No objections here, just wanted it to be clear that it should then be
suppressed, as you mention, in the context of such 'typographic
conventions'. This would also have to be made clear codewise, i.e.
clearly marked as separate from the white-space removal due to the
ws-handling properties. Just so that future devs don't get lost in
that bit of the source...
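
To illustrate the kind of separation I mean, here is a minimal sketch.
It is Python rather than FOP's Java, only to keep it short, and every
function name in it is hypothetical. The white-space pass is a crude
stand-in for the real refinement rules, and "line break" is taken to
mean a linefeed character in the input, which is itself an assumption.

```python
import re

IDEOGRAPHIC_SPACE = "\u3000"


def apply_white_space_handling(text: str) -> str:
    """Crude stand-in for the ws-handling properties: drop spaces, tabs
    and carriage returns around a linefeed. U+3000 is left alone."""
    return re.sub(r"[ \t\r]*\n[ \t\r]*", "\n", text)


def apply_script_conventions(text: str) -> str:
    """Separate pass for script-specific typographic conventions
    (hypothetical rule): drop ideographic spaces directly before a
    linefeed."""
    return re.sub(IDEOGRAPHIC_SPACE + r"+(?=\n)", "", text)


def refine(text: str) -> str:
    # Two clearly separated steps, so neither gets mistaken for the other.
    return apply_script_conventions(apply_white_space_handling(text))
```

The point is only that the second function lives apart from the first,
so a reader of the source can tell which removal comes from which rule.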
And I believe there is nothing in the spec which prohibits this
- quite the opposite actually - the spec IMO encourages 'intelligent'
handling of 'local customs'. Of course I don't know what the CJK
typographic conventions are so this is all a bit hypothetical.
Well, I can't find the exact reference (it may be in one of the earlier
posts in this thread), but I seem to remember that the ideographic
space can only shrink, not expand. Following that, I would say that
there is little difference between "suppressing" a character, and
"shrinking it to zero width". Maybe, since it needs to be shrinkable
anyway, it could be treated along that line?
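
A small sketch of that equivalence, in Python and with made-up names
and numbers. It assumes my recollection is right that the ideographic
space can shrink all the way but not stretch; the Glue class and the
width of 1000 units are purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glue:
    width: int    # natural width, arbitrary units
    stretch: int  # how much it may grow
    shrink: int   # how much it may shrink

    def min_width(self) -> int:
        return self.width - self.shrink


def min_line_width(glyph_widths, trailing=None):
    """Smallest width of a line, with an optional trailing glue element
    counted at its fully shrunk width."""
    total = sum(glyph_widths)
    if trailing is not None:
        total += trailing.min_width()
    return total


# Assumption: shrinkable down to zero, not stretchable.
ideographic_space = Glue(width=1000, stretch=0, shrink=1000)
```

With these values a line that ends in the fully shrunk space is exactly
as wide as the same line without it, which is all I meant by "little
difference".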
I know too little of possible alignments in tb-rl scripts to offer
any certainty here.
My other point was that suppress-at-line-break is not a property that
is restricted to whitespace characters. In theory, *any* character
could be suppressed at a line-break --if so specified by the user. In
any case, FOP should do what the user intended, and not skip that
step because we didn't think about implementing it because we were
convinced that it would never occur anyway (or worse: crash because
confronted with a situation that wasn't foreseen).
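
As a rough sketch of what I mean (Python, hypothetical names, not FOP
code): the check is driven by the property value on each character and
not by whether the character happens to be white space. The "auto"
branch follows the spec as I read it, i.e. only U+0020 is suppressed by
default. The trimming of every suppressible character at both ends of
the line is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Char:
    code: str
    suppress_at_line_break: str = "auto"  # "auto" | "suppress" | "retain"


def is_suppressed(ch: Char) -> bool:
    if ch.suppress_at_line_break == "auto":
        # Default: only the plain space is suppressed.
        return ch.code == " "
    return ch.suppress_at_line_break == "suppress"


def trim_line(line):
    """Drop suppressible characters at the start and end of a line,
    whatever kind of character they are."""
    start, end = 0, len(line)
    while start < end and is_suppressed(line[start]):
        start += 1
    while end > start and is_suppressed(line[end - 1]):
        end -= 1
    return line[start:end]


def text_of(line) -> str:
    return "".join(ch.code for ch in line)
```

A hyphen or an ideographic space marked "suppress" goes through the
same path as a plain space, and a line made up entirely of suppressible
characters simply comes back empty.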