In LineBreakTest.txt, there are test cases that indicate there should *not* be a break after U+0308. However, the LB rule cited does not appear to apply, and it would appear that there *should* be a break. For example:

    × 000A ÷ 0308 × 23E9 ÷ # × [0.3] <LINE FEED (LF)> (LF_NotEastAsian) ÷ [5.03] COMBINING DIAERESIS (CM1_NotEastAsian_CM) × [28.0] BLACK RIGHT-POINTING DOUBLE TRIANGLE (AL) ÷ [0.3]

LB28 states "Do not break between alphabetics (“at”)" with the following break rule:

    (AL | HL) × (AL | HL)

However, in the aforementioned test case, neither U+000A nor U+0308 has break class AL or HL (they have break class LF and CM). Yet rule 28.0 is cited as the reason for not breaking between U+0308 and U+23E9. It would appear that there _should_ be a break here.

Likewise, for the test:

    × 200B ÷ 0308 × 0024 ÷ # × [0.3] ZERO WIDTH SPACE (ZW_NotEastAsian) ÷ [8.0] COMBINING DIAERESIS (CM1_NotEastAsian_CM) × [24.03] DOLLAR SIGN (PR_NotEastAsian) ÷ [0.3]

LB24 states "Do not break between numeric prefix/postfix and letters, or between letters and prefix/postfix" with the following break rules:

    (PR | PO) × (AL | HL)
    (AL | HL) × (PR | PO)

However, neither U+200B nor U+0308 has break class PR, PO, AL, or HL (they have break class ZW and CM). Yet rule 24.03 is cited as the reason for not breaking between U+0308 and U+0024. It would appear that there _should_ be a break here.

In total, I have collected ~80 test cases from LineBreakTest.txt that exhibit this same pattern.

I'm wondering if these test cases were meant to include a hyphen character, because they would then be covered by rule LB20a, which states "Do not break after a word-initial hyphen". This rule has the definition:

    ( sot | BK | CR | LF | NL | SP | ZW | CB | GL ) ( HY | [\u2010] ) × AL

So, for example, the test case:

    × 000A ÷ 0308 × 23E9 ÷ # LF ÷ CM × AL (incorrect?)

would become:

    × 000A ÷ 0308 ÷ 002D × 23E9 ÷ # LF ÷ CM ÷ HY × AL (correct)
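
For anyone who wants to reproduce the list of affected cases, here is a minimal Python sketch of how a test line can be parsed and checked for "no break after U+0308". It assumes only the line layout shown above (break markers alternating with hex code points, followed by a `#` comment). The helper names `parse_test_line` and `no_break_after` are hypothetical and not part of any Unicode tooling.

```python
def parse_test_line(line):
    """Split one LineBreakTest.txt line into code points and break flags.

    Assumes the layout '× 000A ÷ 0308 × 23E9 ÷ # comment', where
    '÷' marks a break opportunity and '×' marks no break.
    """
    tokens = line.split("#", 1)[0].split()
    markers = tokens[0::2]                        # ×/÷ at even positions
    code_points = [int(t, 16) for t in tokens[1::2]]
    if len(markers) != len(code_points) + 1 or any(
        m not in ("×", "÷") for m in markers
    ):
        raise ValueError(f"unexpected test line layout: {line!r}")
    # breaks[i] is True if a break is allowed before code_points[i];
    # the final entry describes the end of the text.
    breaks = [m == "÷" for m in markers]
    return code_points, breaks


def no_break_after(code_points, breaks, target):
    """Return indices of `target` that are followed by '×' and another character."""
    return [
        i
        for i, cp in enumerate(code_points)
        if cp == target and i + 1 < len(code_points) and not breaks[i + 1]
    ]


cps, brk = parse_test_line("× 000A ÷ 0308 × 23E9 ÷ # LF ÷ CM × AL")
print(no_break_after(cps, brk, 0x0308))
```

With the first test case above this reports index 1 (the U+0308 followed by ×), while the hyphenated variant `× 000A ÷ 0308 ÷ 002D × 23E9 ÷` reports nothing, because the marker after U+0308 is ÷.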
