Dear Léane,

Interestingly, rule LB12a and the Line_Break class of EN DASH have changed recently, although not in a way that affects the behaviour you describe: EN DASH used to be lb=BA, and LB12a did not mention HH (because this class did not exist). See L2/24-224 <https://www.unicode.org/L2/L2024/24224-utc181-properties-recs.pdf> Section 6.1 and UTC decision 181-C35 <https://www.unicode.org/L2/L2024/24221.htm#181-C53>.

Indeed, the changes were made in such a way as not to alter the effect of rule LB12a. However, a look at the rationale for LB12a, namely

> Allowing a break after BA <https://www.unicode.org/reports/tr14/#BA> or HY <https://www.unicode.org/reports/tr14/#HY> matches widespread implementation practice and supports a common way of handling special line breaking of explicit hyphens, such as in Polish and Portuguese.

shows two things:

1. I forgot to update this sentence to mention HH;
2. now that hyphens have moved to HH, there is no reason for BA to be in this rule.

Now, removing BA from LB12a would not fix the problem on its own; but we could then move EN DASH back from HH to BA, and this should do the trick. I will bring a proposal for that to the Properties & Algorithms Group <https://www.unicode.org/consortium/props-algorithms.html>.

Best regards,

Robin Leroy

On Mon, 30 Mar 2026 at 21:03, Léane GRASSER via Unicode <[email protected]> wrote:

> Hi,
>
> Recently, while setting up typographic conventions for the French newspaper I contribute to, I noticed an issue that affected en dashes (U+2013), but not em dashes (U+2014).
>
> We collectively decided to use en dashes for parentheticals rather than em dashes, because of their limited size, improved readability over hyphen-minus, and Unicode "compliance". We also put a non-breaking space (U+00A0) inside the parenthetical and a regular space (U+0020) outside.
>
> Example, with "--" as an en dash: Nous mangions des pommes[SP]--[NBSP]les plus rouges au monde[NBSP]--[SP]sous les arbres.
>
> However, since en dashes are considered unambiguous hyphens (HH) in UAX #14, the tailorable line breaking rule LB12a means that there *can* be a line break after en dashes, even with a non-breaking space. LB21 also specifies that there shouldn't be a line break before HHs.
>
> This is a problem in our case, as we, like many other major French publications, use en dashes for parentheticals and therefore can't join the opening dash and the word after with a simple NBSP. This results in the opening dash being placed at the end of the line, which is undesirable. Em dashes are unaffected due to being categorized as B2 rather than HH.
>
> We eventually decided to resort to the WORD JOINER + NBSP combo, rather than falling back to em dashes--but it feels quite hacky.
>
> Therefore, I would like to suggest allowing breaks before the en dash character (U+2013), perhaps by moving the character from HH to B2 along with em dash.
>
> Regards,
> Léane Grasser
>
> PS. I tried to search this mailing list's archive up to 2014 and couldn't find a discussion regarding this very topic. Sorry in advance if it's already been discussed.
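
For readers following the thread, the interaction described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not an implementation of UAX #14: the class table is hand-written from what the messages say (EN DASH as HH, EM DASH as B2) plus the well-known classes of SPACE, NO-BREAK SPACE and WORD JOINER, every other character is lumped into AL, and only three rules are modelled (LB11, LB12, LB12a). The names `LINE_BREAK_CLASS`, `lb_class` and `forbidding_rule` are hypothetical.

```python
# Toy model of three UAX #14 pair rules mentioned in this thread.
# Assumption: classes are hand-assigned here; a real implementation
# would read them from the Unicode Character Database.
LINE_BREAK_CLASS = {
    "\u0020": "SP",  # SPACE
    "\u00A0": "GL",  # NO-BREAK SPACE
    "\u2013": "HH",  # EN DASH (HH according to the thread)
    "\u2014": "B2",  # EM DASH (B2 according to the thread)
    "\u2060": "WJ",  # WORD JOINER
}


def lb_class(ch):
    """Return the hand-assigned class, defaulting to AL (a simplification)."""
    return LINE_BREAK_CLASS.get(ch, "AL")


def forbidding_rule(before, after):
    """Name the modelled rule that forbids a break between two characters.

    Returns None when none of the three modelled rules forbids the break;
    rules that are not modelled here may still forbid it.
    """
    a, b = lb_class(before), lb_class(after)
    if a == "WJ" or b == "WJ":
        return "LB11"   # no break before or after WORD JOINER
    if a == "GL":
        return "LB12"   # no break after a non-breaking space
    if b == "GL" and a not in {"SP", "BA", "HY", "HH"}:
        return "LB12a"  # no break before NBSP, except after SP, BA, HY, HH
    return None


if __name__ == "__main__":
    for label, pair in [
        ("EN DASH + NBSP", "\u2013\u00A0"),
        ("EM DASH + NBSP", "\u2014\u00A0"),
        ("EN DASH + WJ", "\u2013\u2060"),
        ("WJ + NBSP", "\u2060\u00A0"),
    ]:
        print(label, "->", forbidding_rule(pair[0], pair[1]))
```

In this model, the pair EN DASH + NBSP is the only one of the four for which no modelled rule forbids a break, which matches the behaviour reported in the original message; inserting a WORD JOINER between the two makes LB11 apply on both sides.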
