Follow-up Comment #8, bug #58930 (project groff):

[comment #2 comment #2:]
> Unicode considers U+2009 THIN SPACE and U+200A HAIR SPACE breakable...
> Groff... does not offer breaking versions of these spaces, and the only
> reason to add them would be strict compliance with a Unicode property
> that probably no one who uses those code points actually wants

I believe my reasoning here was inaccurate.  Although Unicode _allows_
breaking at a thin space or hair space, it does not _require_ it,* so
groff declining to treat these as break points does not violate Unicode
compliance at all.

Thus I now propose that U+2009 THIN SPACE be mapped to groff's
(nonbreaking) \|, and U+200A HAIR SPACE to groff's (nonbreaking) \^.

* The gory details: Unicode line breaking is covered in "Unicode
Standard Annex #14: Unicode Line Breaking Algorithm"
(http://www.unicode.org/reports/tr14/tr14-45.html), whose introductory
section makes its scope clear: "Given an input text, [this algorithm]
produces a set of positions called 'break opportunities' that are
appropriate points to begin a new line.  The selection of actual line
break positions from the set of break opportunities is not covered by
the Unicode Line Breaking Algorithm, but is in the domain of higher
level software."  Groff declining to break at points that Unicode
specifies as "break opportunities" is perfectly in line with this.

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?58930>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/
