On Thu, 3 Nov 2005 06:03 am, J.Pietschmann wrote:
> Manuel Mall wrote:
> > a) Yes, UAX#14 always breaks at the end of a sequence of spaces
> > b) But it also says that it assumes any trailing spaces in a line
> > are being removed
> > This "conflicts" with XSL-FO, which can force spaces to be retained;
> > therefore adjustments to the algorithm are necessary to cater for
> > that.
>
> Computing line breaking opportunities and discarding whitespace at
> the end (or beginning) of a line are different matters. If whitespace
> has to be retained, trailing spaces after a non-space string may
> simply mean the previous line breaking opportunity has to be used,
> because otherwise the string including the trailing spaces will
> overflow the line area. The trailing whitespace may also influence
> text justification.
>
Hmm, to me it appears that UNICODE and XSL-FO have slightly different 
models when it comes to white space in the context of line breaking, 
and that difference is causing the discussion here. In UNICODE everything is based 
simply on the properties of the codepoint in question and its 
neighbour. In XSL-FO one can change the behaviour of a codepoint by 
setting those white space related XSL-FO properties. That is not a 
concept within UNICODE. If you want to retain white space in UNICODE 
you use a different codepoint. If you want to retain a space in XSL-FO 
you could use a different codepoint but more likely you set an XSL-FO 
property if you want this applied widely in your document.
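
To make the two models concrete, here is a rough Python sketch. It is 
not the UAX#14 algorithm, only its "break after a run of spaces" 
behaviour, and the retain_trailing_spaces flag is a made-up stand-in 
for the XSL-FO white space properties. The function names are 
hypothetical too. It shows that finding the break opportunities and 
deciding what happens to the spaces before the break can be kept as 
separate steps:

```python
import re


def break_opportunities(text):
    """Return offsets where a break is allowed: after each run of U+0020.

    Grossly simplified stand-in for UAX#14; only spaces are considered,
    and a run at the very end of the text yields no opportunity.
    """
    return [m.end() for m in re.finditer(r" +", text) if m.end() < len(text)]


def split_at(text, offset, retain_trailing_spaces):
    """Split text at a break opportunity.

    retain_trailing_spaces is a hypothetical flag standing in for the
    XSL-FO white space properties; the UNICODE model corresponds to False.
    """
    line, rest = text[:offset], text[offset:]
    if not retain_trailing_spaces:
        # UNICODE model: spaces before the break are assumed to be removed.
        line = line.rstrip(" ")
    return line, rest


text = "foo   bar baz"
offsets = break_opportunities(text)
unicode_style = split_at(text, offsets[0], retain_trailing_spaces=False)
xslfo_style = split_at(text, offsets[0], retain_trailing_spaces=True)
```

With spaces retained, the first line is three characters longer, so it 
may no longer fit and an earlier break opportunity would have to be 
chosen, which is the effect described in the quoted text above.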

If we want to 'marry' UNICODE linebreaking with XSL-FO white space 
handling we have this interaction to consider. One possible solution 
would be to replace spaces (U+0020) by different codepoints which 
mimic the behaviour modification imposed by any XSL-FO white space 
handling properties in effect. But I am not sure if this can be done in 
all cases. Otherwise we may have to modify the UNICODE line breaking 
algorithm to cater for the XSL-FO white space specialities.
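
A minimal Python sketch of the substitution idea, covering only the one 
case where a replacement codepoint seems obvious to me: if wrap-option 
is "no-wrap", U+0020 is replaced by U+00A0 (NO-BREAK SPACE) so that an 
unmodified line breaker sees no break opportunity at the spaces. The 
function name and the idea of passing the property value as a string 
are assumptions for illustration, and treating "no-wrap" this way is 
only an approximation of the XSL-FO semantics. The other white space 
properties are deliberately not handled:

```python
NO_BREAK_SPACE = "\u00a0"  # line break class GL in UAX#14


def substitute_spaces(text, wrap_option="wrap"):
    """Replace U+0020 so that a plain UNICODE line breaker behaves as the
    XSL-FO property asks. Hypothetical helper; only wrap-option is handled.
    """
    if wrap_option == "no-wrap":
        # No break may occur at these spaces, so make them non-breaking.
        return text.replace(" ", NO_BREAK_SPACE)
    # Default behaviour: leave the spaces alone.
    return text


wrapped = substitute_spaces("a b  c")
unwrapped = substitute_spaces("a b  c", wrap_option="no-wrap")
```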

> J.Pietschmann

Manuel
