Hi,

Ingo Schwarze wrote on Tue, May 02, 2017 at 03:45:01PM +0200:

> By the way, in the meantime, i also received support from NetBSD/pkgsrc
> for my proposal (\- always U+002D, \(mi always U+2212). That's
> Ralph, Branden, NetBSD/pkgsrc, and one relevant FreeBSD developer
> then, and no protest so far from OpenBSD. I think i'll start
> preparing patches and submit them to the groff bugtracker when they
> are ready. Let's do that in slow, careful steps.

I just submitted the first, less intrusive patch:

  https://savannah.gnu.org/bugs/index.php?50917
  Make \- consistently render as U+002D across all macro sets and devices

For easy access, i'm also appending the patch below.
See the bug notes for remarks regarding non-UTF-8 output devices.

Comments are welcome, and if consensus emerges, so is a commit.

After that, i would then move on to the next step and submit a patch
to always render unescaped "-" as U+2010 HYPHEN. Of course, it would
be OK to accept this first patch but reject or delay that second one.

Yours,
  Ingo


diff --git a/font/devutf8/NOTES b/font/devutf8/NOTES
index 8e724703..cd997ac2 100644
--- a/font/devutf8/NOTES
+++ b/font/devutf8/NOTES
@@ -6,9 +6,10 @@ Kernighan) is unmapped:

   bs shaded solid ball (Bell System logo, AT&T logo)

-Character 0x002D has not been given a name because its Unicode name
-HYPHEN-MINUS is so ambiguous that it is unusable for serious typographic
-use.
+Even though its Unicode name HYPHEN-MINUS is so ambiguous that it
+is unusable for serious typographic use, the character 0x002D is
+needed to describe the syntax of many programming languages and to
+represent code samples. It can be obtained with `\-'.

 \[wp] has been mapped to 0x2118, because according to Unicode 4.1's
 NamesList.txt, U+2118 SCRIPT CAPITAL P is really a Weierstrass `p',
diff --git a/man/groff_char.7.man b/man/groff_char.7.man
index e38c70b9..a77ca08c 100644
--- a/man/groff_char.7.man
+++ b/man/groff_char.7.man
@@ -279,6 +279,9 @@ hyphen (Unicode u2010).
 The same output glyph can be requested explicitly
 with \[oq]\f(CW\e(hy\fP\[cq].
 A minus sign can be obtained with \[oq]\f(CW\e(mi\fP\[cq] (Unicode u2212).
+To describe programming language syntax or present code samples,
+the original character can be obtained
+with \[oq]\f(CW\e-\fP\[cq] (Unicode u002D).
 .
 .
 .TP
@@ -315,6 +318,7 @@ _
 \[char43] \[char43] 43 plus u002B plus
 \[char44] \[char44] 44 comma u002C comma
 \[char45] \[char45] 45 hyphen u2010 hyphen
+\- \e- minus u002D hyphen-minus
 \[char46] \[char46] 46 period u002E period, dot
 \[char47] \[char47] 47 slash u002F slash
 \[char58] \[char58] 58 colon u003A colon
@@ -753,6 +757,7 @@ _
 \[r?] \e[r?] questiondown u00BF inverted question mark
 \[em] \e[em] emdash u2014 em-dash symbol +
 \[en] \e[en] endash u2013 en-dash symbol
+\- \e- minus u002D ASCII hyphen-minus +
 \[hy] \e[hy] hyphen u2010 hyphen symbol +
 .TE
 .ad
diff --git a/src/libs/libgroff/glyphuni.cpp b/src/libs/libgroff/glyphuni.cpp
index 62c81654..c94a37f8 100644
--- a/src/libs/libgroff/glyphuni.cpp
+++ b/src/libs/libgroff/glyphuni.cpp
@@ -55,6 +55,7 @@ struct S {
   { "+", "002B" },
   { "pl", "002B" },
   { ",", "002C" },
+  { "\\-", "002D" },
   { ".", "002E" },
   { "/", "002F" },
   { "sl", "002F" },
@@ -397,11 +398,6 @@ struct S {
   { "product", "220F" },
   { "coproduct", "2210" },
   { "sum", "2211" },
-  // `mi' and `\-' represent a MINUS sign. But it is used in many man pages
-  // to denote the U+002D character that introduces a command-line option.
-  // For devices that support copy&paste, such as devhtml and devutf8, the
-  // user can apply the workaround described in the PROBLEMS file.
-  { "\\-", "2212" },
   { "mi", "2212" },
   { "-+", "2213" },
   { "**", "2217" },
diff --git a/src/libs/libgroff/uniglyph.cpp b/src/libs/libgroff/uniglyph.cpp
index 3fafb225..2c773b8b 100644
--- a/src/libs/libgroff/uniglyph.cpp
+++ b/src/libs/libgroff/uniglyph.cpp
@@ -52,6 +52,7 @@ struct S {
 //{ "002B", "+" },
   { "002B", "pl" },
   { "002C", "," },
+  { "002D", "\\-" },
   { "002E", "." },
 //{ "002F", "/" },
   { "002F", "sl" },
@@ -390,7 +391,6 @@ struct S {
   { "2210", "coproduct" },
   { "2211", "sum" },
   { "2212", "mi" },
-//{ "2212", "\\-" },
   { "2213", "-+" },
   { "2217", "**" },
   { "221A", "sr" },
diff --git a/tmac/an-old.tmac b/tmac/an-old.tmac
index 1091e3d2..199baccd 100644
--- a/tmac/an-old.tmac
+++ b/tmac/an-old.tmac
@@ -676,9 +676,8 @@
 .\" of easy cut and paste.
 .
 .if '\*[.T]'utf8' \{\
-. rchar \- - ' `
+. rchar - ' `
 .
-. char \- \N'45'
 . char - \N'45'
 . char ' \N'39'
 . char ` \N'96'
diff --git a/tmac/doc.tmac-u b/tmac/doc.tmac-u
index 8b7ad4cb..01eff136 100644
--- a/tmac/doc.tmac-u
+++ b/tmac/doc.tmac-u
@@ -6550,9 +6550,8 @@
 .\" of easy cut and paste.
 .
 .if '\*[.T]'utf8' \{\
-. rchar \- - ' `
+. rchar - ' `
 .
-. char \- \N'45'
 . char - \N'45'
 . char ' \N'39'
 . char ` \N'96'
diff --git a/PROBLEMS b/PROBLEMS
index e0f1239a..c97f96f5 100644
--- a/PROBLEMS
+++ b/PROBLEMS
@@ -107,7 +107,6 @@ those characters back to the ASCII characters, insert the following
 code snippet into the `troffrc' configuration file:

 .if '\*[.T]'utf8' \{\
-. char \- \N'45'
 . char - \N'45'
 . char ' \N'39'
 .\}
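
For readers who want to see the intended end state of the glyph-name
mapping at a glance, here is a small stand-alone sketch. It is not part
of the patch and not libgroff code: the GlyphMapping struct and the
lookup_glyph_unicode() helper are made up for illustration, and the
table only mirrors a handful of entries as glyphuni.cpp and
groff_char(7) would list them once the patch is applied.

```cpp
// Illustrative sketch only; names here are hypothetical, not libgroff API.
#include <cstddef>
#include <cstring>

// One row of a glyph-name to Unicode code point table, in the same
// spirit as the "struct S" table touched by the patch.
struct GlyphMapping {
  const char *glyph_name;   // groff glyph name, e.g. "mi" for \(mi
  const char *unicode_hex;  // code point as four hex digits
};

static const GlyphMapping glyph_table[] = {
  { ",",   "002C" },
  { "\\-", "002D" },  // entry added by the patch: \- is HYPHEN-MINUS
  { ".",   "002E" },
  { "hy",  "2010" },  // \(hy is HYPHEN
  { "mi",  "2212" },  // \(mi remains MINUS SIGN
};

// Linear search is enough for a five-entry demonstration table.
// Returns the hex code point, or nullptr for an unknown glyph name.
const char *lookup_glyph_unicode(const char *glyph_name)
{
  const std::size_t n = sizeof(glyph_table) / sizeof(glyph_table[0]);
  for (std::size_t i = 0; i < n; i++)
    if (std::strcmp(glyph_table[i].glyph_name, glyph_name) == 0)
      return glyph_table[i].unicode_hex;
  return nullptr;
}
```

Calling lookup_glyph_unicode("\\-") in this sketch yields "002D", while
"mi" still yields "2212", which is the separation the proposal is
after: \- for code and option syntax, \(mi for the mathematical minus.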
