It appears that http://www.unicode.org/Public/8.0.0/ucd/auxiliary/LineBreakTest.txt is testing a tailoring rather than the default line break algorithm, contrary to its heading "# Default Line Break Test". The companion page http://www.unicode.org/Public/UCD/latest/ucd/auxiliary/LineBreakTest.html does the same.

The default algorithm, as shown in http://www.unicode.org/reports/tr14/#Table2, follows LB25, which is an approximation of the desired behavior. The test file and the HTML page do not follow it. I suspect they instead expect the tailoring described in Example 7 of http://www.unicode.org/reports/tr14/#Examples.

For example, the test file tests for, and the HTML page states, that a class CL code point followed by a class PO code point is an unconditional line break opportunity, based on rule 999 (which is the same as LB31 in TR14).

By contrast, http://www.unicode.org/reports/tr14/#Table2 says that a class CL code point followed by a class PO one is an "indirect break opportunity": "B % A is equivalent to B × A and B SP+ ÷ A; in other words, do not break before A, unless one or more spaces follow B." This follows from LB25 and LB18.

There is a discrepancy here, which could be resolved either by changing the tests and the HTML page to follow LB25, or by documenting that they cover something above and beyond the default algorithm. (There may also be other discrepancies that I haven't come across.)


