Hi Oliver,

Oliver Corff wrote on Sat, May 15, 2021 at 11:39:31PM +0200:

> I try to use the correct abbreviation for the former Czechoslovak
> Socialist Republic, which is U+010C SSR (C + hacek, caron, wedge).
> The first attempt (enter Unicode 0x010C directly, leaving everything to
> preconv(1), did not work.

Works for me:

   $ printf '\xc4\x8cSSR' | mandoc
   $ printf '\xc4\x8cSSR' | groff -kT utf8

Both commands above produce the expected output for me
(OpenBSD-current with no fancy configuration changes,
just using the default installation).

   00000000  c4 8c 53 53 52 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a  |..SSR...........|
   00000010  0a 0a 0a 0a 0a 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a  |................|

> Then I consulted groff_char(7) but there is no
> predefined \[vC], only \[vS] etc. for base letters s, S, z and Z. No C!
> I keep scratching my head.

Works for me:

   $ printf '\\[u010C]SSR' | mandoc
   $ printf '\\[u010C]SSR' | groff -T utf8

Both commands above produce the expected output; specifically:

   $ printf '\\[u010C]SSR' | groff -T utf8 | hexdump -C
   00000000  c4 8c 53 53 52 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a  |..SSR...........|

> None of the other suggested notations (like \[u0043_030C] work (see
> groff(7)) out of the box.

Mandoc doesn't support that syntax, but with groff, even that works
for me:

   $ printf '\\[u0043_030C]SSR' | mandoc -T lint
   mandoc: <stdin>:1:1: WARNING: invalid escape sequence: \[u0043_030C]
   $ printf '\\[u0043_030C]SSR' | groff -T utf8 | hexdump -C
   00000000  c4 8c 53 53 52 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a  |..SSR...........|

> .AM

I don't think any fancy workarounds are needed.

Yours,
  Ingo
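
Note: the byte sequence c4 8c in the dumps above is the UTF-8 encoding of U+010C. If printf(1) on some system does not understand the \x hex escapes used in the commands above (POSIX specifies only octal escapes for printf), the same input can probably be produced with octal escapes. The following is only a minimal sketch under that assumption; the function names cssr and cssr_hex are made up for illustration and appear nowhere in groff or mandoc.

```shell
# Hypothetical helper: emit U+010C (UTF-8 bytes c4 8c, written here as
# octal 304 214) followed by "SSR" and a newline.  Octal escapes are
# the form POSIX printf(1) specifies; \x hex escapes are an extension.
cssr() {
    printf '\304\214SSR\n'
}

# Hypothetical helper: dump those bytes as hex with od(1).  Spaces and
# newlines are stripped because od implementations differ in spacing.
cssr_hex() {
    cssr | od -An -tx1 | tr -d ' \n'
}

cssr_hex; echo
```

If the bytes come out as c4 8c 53 53 52 0a, the input is fine, and the output of cssr can be piped into "mandoc" or "groff -kT utf8" in the same way as in the commands above.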