On 27.02.2006 14:59:31 Manuel Mall wrote:
> On Monday 27 February 2006 21:33, Jeremias Maerki wrote:
> > On 27.02.2006 12:36:58 Manuel Mall wrote:
> > > On Monday 27 February 2006 18:55, Jeremias Maerki wrote:
> > > > What's the status of UAX#14? Has anybody had time to work
> > > > on that yet? I'm asking because I'm considering hacking in
> > > > support for the fixed width spaces (U+2000..U+200A). One of my
> > > > clients asks for that but I can't allocate enough time right now
> > > > to do the whole thing, unfortunately.
> > >
> > > I don't think UAX#14 will happen in a hurry.  However in
> > > http://wiki.apache.org/xmlgraphics-fop/LineBreaking I do describe
> > > possible handling of fixed width spaces. The main decision, and
> > > that has little to do with UAX#14, is whether these spaces are to be
> > > treated like white space when it comes to linebreaks or like
> > > non-breakable spaces. If one follows the XSL-FO spec to the letter
> > > these spaces are not white space and therefore are not removed
> > > around a line break. I have no idea what actual user expectations
> > > are when it comes to these spaces. Would authors (especially in
> > > non-English / non-Latin languages) expect these spaces to be removed around
> > > a linebreak or not? The relevant Knuth sequences which need to be
> > > generated depend on that decision: Is the space removable or not
> > > when a break occurs?
> >
> > I think we're talking about two different removals here, right? One
> > is about the FO white-space-affecting properties, which I think do
> > not affect special Unicode spaces (only XML white space, see below).
> > The other is about line-breaking, where I think the
> > space that makes up the break possibility is removed (except in the
> > case of tagged PDF where the space will need to be preserved for the
> > structure info) but not any of the other "special" spaces in the
> > vicinity. At least, that would be my expectation and my
> > interpretation.
> >
> 
> Removal of spaces around formatter line breaks is also covered by the 
> spec. The property suppress-at-line-break controls it. And check its 
> definition of "auto". The fixed width spaces are explicitly excluded. 
> So, contrary to my initial post there is no ambiguity in the spec. 
> Fixed width spaces are not removed unless the user explicitly sets the 
> suppress-at-line-break property. As we do not yet support the 
> suppress-at-line-break property the only Knuth sequences which need to 
> be generated are for non-elastic, non-removable spaces. That should be 
> reasonably straightforward.
> 
> Interestingly enough, this means the default behaviour of
> suppress-at-line-break is that, independent of any other white space
> handling properties, U+0020 (space) is always(!) removed around
> formatter generated line breaks. Need to think about that a bit more.

Wait a sec! suppress-at-line-break only applies to fo:character, not to
general text content! I think it is less complicated than you think
right now.
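
To make the "non-elastic, non-removable" case concrete, here is a minimal
sketch of the kind of Knuth element sequence I have in mind. It is plain
Python rather than FOP code, and the Box/Glue/Penalty classes, the
elements_for() helper and all widths are hypothetical, made up for this
illustration. It also assumes that a break is allowed after a fixed width
space, which is a decision we would still have to make.

```python
from dataclasses import dataclass

# U+2000..U+200A, the fixed width spaces discussed in this thread.
FIXED_WIDTH_SPACES = frozenset(chr(cp) for cp in range(0x2000, 0x200B))


@dataclass(frozen=True)
class Box:
    """Unbreakable content with a fixed width; never discarded at a break."""
    width: int


@dataclass(frozen=True)
class Glue:
    """Elastic space; the glue chosen as a break point is discarded."""
    width: int
    stretch: int
    shrink: int


@dataclass(frozen=True)
class Penalty:
    """A legal break point with a cost (0 means a neutral break)."""
    value: int


def elements_for(text, char_width=600, space_width=300):
    """Build a simplified element list for `text`.

    All widths are placeholder values, not real font metrics.
    """
    elements = []
    for ch in text:
        if ch == " ":
            # Normal space: elastic, and removed if the line breaks here.
            elements.append(Glue(space_width, space_width // 2, space_width // 3))
        elif ch in FIXED_WIDTH_SPACES:
            # Fixed width space: a box has no stretch/shrink and survives a
            # break. The zero penalty after it is the assumed break point.
            elements.append(Box(space_width))
            elements.append(Penalty(0))
        else:
            elements.append(Box(char_width))
    return elements
```

With this, "a b" produces a glue between the two boxes, while "a\u2003b"
produces box, box, penalty, box, so the em space stays at the end of the
line if the break is taken.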

> > I've just gone through the FO spec again searching for "white" and it
> > seems to me that the spec makes a rather clear distinction between
> > white-space in terms of the XML spec and white-space in
> > general.
> >
> > > I am also uncertain how these spaces interact with line
> > > justification. They are by definition not elastic. So if you have
> > > only a fixed width space between two words, this is not an inter-word
> > > gap that can be used for justification.
> >
> > Yes.
> >
> > > Therefore any calculations which rely on knowing the
> > > number of words on a line to determine how many inter word gaps we
> > > have to then calculate the per gap justification amount will need
> > > to be adjusted to not count inter word gaps which only contain
> > > fixed width spaces. On the other hand they are still word
> > > boundaries for the purpose of finding words for hyphenation.
> >
> > Yes.
> >
> > But is there really a problem when it comes to adjusting inter-word
> > gaps? That's already handled by the right element list for all
> > the different cases, right? At least, I don't see where exactly
> > you're uncertain. The fixed width spaces just don't contribute any
> > stretch/shrink to inter-word gaps.
> >
> 
> Yes, the Knuth algorithm will take the stretch/shrink into account when
> doing its optimal line breaking, but it will not tell you what the final
> inter-word gap is. That is, I think, computed separately based on the
> number of words found, with some fine tuning. This is where you may (or
> may not) run into trouble.

Ah, ok. Thanks for the clarification.
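
If I understand the concern correctly, it could be sketched like this.
Again, this is plain Python and not how FOP computes it today. The helper
names (words, adjustable_gap_count, per_gap_adjustment) are hypothetical,
and I'm assuming that a gap counts as adjustable only if it contains at
least one U+0020.

```python
import re

# U+0020 plus the fixed width spaces U+2000..U+200A.
SPACE_CHARS = " " + "".join(chr(cp) for cp in range(0x2000, 0x200B))
GAP_RE = re.compile("[ \u2000-\u200a]+")


def words(line):
    """Split on any space run: fixed width spaces are word boundaries too."""
    return [w for w in GAP_RE.split(line.strip(SPACE_CHARS)) if w]


def adjustable_gap_count(line):
    """Count inter-word gaps that contain at least one elastic U+0020."""
    gaps = GAP_RE.findall(line.strip(SPACE_CHARS))
    return sum(1 for gap in gaps if " " in gap)


def per_gap_adjustment(line, slack):
    """Distribute `slack` (line width minus natural width) over elastic gaps."""
    count = adjustable_gap_count(line)
    return slack / count if count else 0.0
```

For "a b\u2003c d" this finds four words but only two adjustable gaps, so
a calculation that assumes "number of words minus one" gaps would spread
the slack over three gaps and come out wrong.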

> > I'll look into the fixed width spaces. So, thanks for your fast
> > answer and the valuable pointer to the Wiki. In case I don't manage
> > to do this cleanly, the least I can do is make sure we don't get ugly
> > "#" in the output because the renderers don't know about the special
> > spaces. This will also help for when someone has time to go towards
> > UAX#14.
> >
> > Jeremias Maerki
> 
> Manuel



Jeremias Maerki
