Manuel Mall wrote:
Hmm, to me it appears that UNICODE and XSL-FO have slightly different models when it comes to white space in the context of line breaking which is causing the discussion here.
I don't think so. The overlap between UAX14 and XSLFO is that both mandate a line break for each LF which survived the character level refinement stage. UAX14 is all about where an application might place a line break, and where it shouldn't. The notice about "space at the end of a line is usually discarded" is just a notice. There is absolutely nothing in UAX14 on how sequences of spaces should be handled.

XSLFO, on the other hand, doesn't specify any mechanism for finding line breaking opportunities. It just says that a LF which is treated as a LF should cause a line break, and leaves finding other positions to the implementation.

As an example, let's take the following FO snippet (spaces denoted by underscores for visibility):

  <fo:block>A_nice_word.</fo:block>

Provided all properties are at their default value, a processor which produces the following layout

  A nice
  word.

may claim conformance to both UAX14 and XSLFO 1.0. If it produces

  A ni
  ce word.

it may claim conformance to XSLFO but not UAX14. If it produces

  A_nice_
  word.

it may claim conformance to UAX14 but not XSLFO, because of the trailing space in the first line.
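To make the three cases concrete, here is a minimal Python sketch. It is only an illustration under loud assumptions: break_opportunities() is a crude stand-in that allows a break only after a run of spaces, which is not the real UAX14 algorithm, and strip_around_breaks() stands in for the XSLFO default behaviour of dropping spaces next to a line break. All function names are hypothetical.

```python
TEXT = "A nice word."  # content of the fo:block, with real spaces


def break_opportunities(text):
    """Offsets where a break is allowed under the stand-in rule:
    after a run of spaces, before the next non-space character."""
    return [i for i in range(1, len(text))
            if text[i - 1] == " " and text[i] != " "]


def break_at(text, offset):
    """Promote one offset to a real line break; no characters are dropped."""
    return [text[:offset], text[offset:]]


def strip_around_breaks(lines):
    """Remove spaces adjacent to each break, after the break was chosen."""
    last = len(lines) - 1
    out = []
    for i, line in enumerate(lines):
        if i > 0:            # line starts right after a break
            line = line.lstrip(" ")
        if i < last:         # line ends at a break
            line = line.rstrip(" ")
        out.append(line)
    return out


def has_trailing_space(lines):
    """True if any line that ends at a break still ends with a space."""
    return any(line.endswith(" ") for line in lines[:-1])


opportunities = break_opportunities(TEXT)          # [2, 7]

layout_1 = strip_around_breaks(break_at(TEXT, 7))  # ['A nice', 'word.']
layout_2 = break_at(TEXT, 4)                       # ['A ni', 'ce word.']
layout_3 = break_at(TEXT, 7)                       # ['A nice ', 'word.']
```

Layout 1 breaks at an allowed offset and has no space left at the break. Layout 2 breaks at offset 4, which is not an opportunity. Layout 3 breaks at an allowed offset but keeps the trailing space.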
If we want to 'marry' UNICODE linebreaking with XSL-FO white space handling we have this interaction to consider.
I still think that finding line break opportunities and handling white space are different things, and can be handled nearly independently. Note that white space removal around line breaks happens after a break opportunity has actually been "promoted" to a real line break.

J.Pietschmann
