On 27.02.2006 12:36:58 Manuel Mall wrote:
> On Monday 27 February 2006 18:55, Jeremias Maerki wrote:
> > What's the status of UAX#14? Has anybody had time to work on
> > that yet? I'm asking because I'm considering hacking in support for
> > the fixed width spaces (U+2000..U+200A). One of my clients asks for
> > that but I can't allocate enough time right now to do the whole
> > thing, unfortunately.
> I don't think UAX#14 will happen in a hurry. However in
> http://wiki.apache.org/xmlgraphics-fop/LineBreaking I do describe
> possible handling of fixed width spaces. The main decision, which
> has little to do with UAX#14, is whether these spaces are to be treated
> like white space when it comes to line breaks or like non-breakable spaces.
> If one follows the XSL-FO spec to the letter these spaces are not white
> space and therefore are not removed around a line break. I have no idea
> what actual user expectations are when it comes to these spaces. Would
> authors (especially in non-English / non-Latin languages) expect these
> spaces to be removed around a line break or not? The relevant Knuth
> sequences which need to be generated depend on that decision: Is the
> space removable or not when a break occurs?
I think we're talking about two different removals here, right? One is
about the FO white-space-affecting properties. I think those do not
affect the special Unicode spaces (only XML white space, see below). The
other is about line breaking: there, I think the space that makes up the
break possibility is removed (except in the case of tagged PDF, where the
space will need to be preserved for the structure info), but not any of
the other "special" spaces in the vicinity. At least, that would be my
expectation and my interpretation.
I've just gone through the FO spec again searching for "white", and the
spec seems to make a rather clear distinction between white space in
terms of the XML spec and white space in general.
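To illustrate the distinction, here is a minimal Python sketch (not FOP
code; the function names are made up for this mail). It assumes that
"XML white space" means exactly the four characters of production S in
XML 1.0 and that collapsing only ever touches those, so the fixed width
spaces U+2000..U+200A pass through untouched:

```python
# XML 1.0 white space (production S): space, tab, carriage return, line feed.
XML_WHITE_SPACE = {"\u0020", "\u0009", "\u000D", "\u000A"}


def is_xml_white_space(ch):
    """True only for the four XML white space characters."""
    return ch in XML_WHITE_SPACE


def is_fixed_width_space(ch):
    """True for the Unicode fixed width spaces U+2000..U+200A."""
    return "\u2000" <= ch <= "\u200A"


def collapse_xml_white_space(text):
    """Collapse each run of XML white space into a single U+0020.

    Hypothetical helper: every other character, including the fixed
    width spaces, is copied through unchanged.
    """
    out = []
    in_run = False
    for ch in text:
        if is_xml_white_space(ch):
            if not in_run:
                out.append(" ")
            in_run = True
        else:
            out.append(ch)
            in_run = False
    return "".join(out)
```

With this, collapse_xml_white_space("a \n b\u2003\u2003c") keeps both
EM SPACE characters and only collapses the run around the line feed.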
> I am also uncertain how these spaces interact with line justification.
> They are by definition not elastic. So if you have a fixed width space
> only between two words this is not an inter word gap that can be used
> for justification. Therefore any calculations which rely on knowing the
> number of words on a line to determine how many inter word gaps we have
> to then calculate the per gap justification amount will need to be
> adjusted to not count inter word gaps which only contain fixed width
> spaces. On the other hand they are still word boundaries for the
> purpose of finding words for hyphenation.
But is there really a problem when it comes to adjusting inter-word gaps?
That's already handled by generating the right element list for all the
different cases, right? At least, I don't see where exactly you're
uncertain. The fixed width spaces just don't contribute any
stretch/shrink to the inter-word gaps.
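A rough sketch of what I mean, in Python rather than in FOP's own
classes (Box, Glue and the two functions are illustrative names only,
and modelling a fixed width space as glue with zero stretch and shrink
is an assumption, not a decision on the break question). If the
justification amount is derived from the summed stretch/shrink of the
line, nobody has to count gaps; a fixed width space simply adds nothing
to the sum:

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    width: float  # a word or word fragment


@dataclass
class Glue:
    width: float    # natural width
    stretch: float  # 0 for a fixed width space (assumption of this sketch)
    shrink: float   # 0 for a fixed width space (assumption of this sketch)


def adjustment_ratio(line, target_width):
    """Knuth-style adjustment ratio for one already-broken line."""
    natural = sum(e.width for e in line)
    glues = [e for e in line if isinstance(e, Glue)]
    if natural == target_width:
        return 0.0
    if natural < target_width:
        total = sum(g.stretch for g in glues)
        return (target_width - natural) / total if total > 0 else math.inf
    total = sum(g.shrink for g in glues)
    return (target_width - natural) / total if total > 0 else -math.inf


def justified_widths(line, target_width):
    """Width of every element after justification."""
    r = adjustment_ratio(line, target_width)
    if not math.isfinite(r):
        r = 0.0  # nothing elastic on the line: keep the natural widths
    widths = []
    for e in line:
        if isinstance(e, Glue):
            widths.append(e.width + r * (e.stretch if r >= 0 else e.shrink))
        else:
            widths.append(e.width)
    return widths


# word, normal space, word, fixed width space, word, normal space, word
line = [Box(30), Glue(3, 1.5, 1), Box(20), Glue(5, 0, 0),
        Box(25), Glue(3, 1.5, 1), Box(14)]
widths = justified_widths(line, 106)
```

In this example the two normal spaces absorb all of the extra width and
the fixed width space keeps its natural width.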
I'll look into the fixed width spaces. Thanks for your fast answer
and the valuable pointer to the Wiki. In case I don't manage to do this
cleanly, the least I can do is make sure we don't get ugly "#" in the
output because the renderers don't know about the special spaces. That
will also help when someone has time to work towards UAX#14.
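For the fallback, something along these lines might do (a Python sketch
only, not what the renderers do today; fallback_advance and its
parameters are hypothetical). The idea is to compute an advance width
for each special space from the font size instead of looking up a glyph
that the font may not have. The em fractions for U+2000..U+2006 follow
the Unicode character names; the values for THIN SPACE and HAIR SPACE
are assumptions of mine, since those widths are font-dependent:

```python
from fractions import Fraction

# Fraction of an em, as implied by the Unicode character names.
EM_FRACTIONS = {
    "\u2000": Fraction(1, 2),  # EN QUAD
    "\u2001": Fraction(1, 1),  # EM QUAD
    "\u2002": Fraction(1, 2),  # EN SPACE
    "\u2003": Fraction(1, 1),  # EM SPACE
    "\u2004": Fraction(1, 3),  # THREE-PER-EM SPACE
    "\u2005": Fraction(1, 4),  # FOUR-PER-EM SPACE
    "\u2006": Fraction(1, 6),  # SIX-PER-EM SPACE
}

# ASSUMPTION: these two are font-dependent; the values are placeholders.
ASSUMED_EM_FRACTIONS = {
    "\u2009": Fraction(1, 5),   # THIN SPACE
    "\u200A": Fraction(1, 10),  # HAIR SPACE
}


def fallback_advance(ch, font_size, digit_width, period_width):
    """Advance width for a fixed width space, or None for other characters.

    font_size, digit_width and period_width share one unit; the last two
    would come from the font metrics (width of a digit and of '.').
    """
    if ch in EM_FRACTIONS:
        return EM_FRACTIONS[ch] * font_size
    if ch in ASSUMED_EM_FRACTIONS:
        return ASSUMED_EM_FRACTIONS[ch] * font_size
    if ch == "\u2007":  # FIGURE SPACE: as wide as a digit
        return digit_width
    if ch == "\u2008":  # PUNCTUATION SPACE: as wide as a period
        return period_width
    return None
```

A renderer could then emit an empty advance of that width (or a plain
space adjusted to it) whenever the font lacks the glyph.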