Hi,

i think it is clear from Ralph's extensive analysis that this whole
thing is a mess: Even looking at groff only, for historical reasons,
the input sequences

  - \- \(hy \(mi \(en

are handled differently across output devices and across macro sets,
so even using current groff alone, no consistent way exists to get
the equivalent of the Unicode HYPHEN-MINUS character, even though
that is important for manual pages.

Besides, writing manual pages absolutely needs to be simple. Manual
pages must be written by programmers who may not know typography and
who are not prepared to, and shouldn't be required to, acquire
specialized knowledge just to write the required manuals together
with their code. So even if we came up with some elaborate
recommendation about hyphens/dashes/minuses in manual pages, it would
be useless because it wouldn't be followed in practice.

While i consider the above a serious issue, i'm much less worried
about Ralph's concern with old implementations. Frankly, there are
only three practically relevant roff implementations that are widely
used for manual pages: groff, Heirloom, and mandoc. The maintainers
are all active on this list and cooperate well. So we have the chance
to decide something that is simple and implement it everywhere, even
if it diverges somewhat from historical practice.

To understand my following proposal, observe this:

First, in contrast to classical typography, we need four rather than
the usual three output characters (using Unicode names for clarity
without intending to imply that Unicode is used as the character set
by each output device):

 1. U+2010 HYPHEN
 2. U+2013 EN DASH
 3. U+2212 MINUS SIGN
 4. U+002D HYPHEN-MINUS

The last one doesn't exist in normal typography, but is required for
programming and hence for manual pages.

In cases where you are not concerned about copy and paste but want a
particular typographic representation, no matter whether in a manual
page or in some other document, you can already use the escape
sequences \(hy \(mi \(en today.

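For illustration only, here is a minimal sketch of manual page source
that asks for the three typographic characters explicitly in running
text. The page name and the wording are made up, and what actually
gets rendered still depends on the output device:

```roff
.\" Hypothetical page, for illustration only.
.TH EXAMPLE 1
.SH DESCRIPTION
.\" \(hy requests a typographic hyphen.
This is a well\(hyknown problem.
.\" \(en requests an en dash, for example in a range.
See pages 3\(en5 of the manual.
.\" \(mi requests a minus sign.
The offset is \(mi1.
```
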
OUTSIDE manual pages, you can also use - for \(hy (and you usually
will do that) and you can use \- for \(mi (though i probably wouldn't
recommend that; it mostly exists for historical reasons).

INSIDE manual pages, using - for \(hy or \- for \(mi is already a
terrible idea today because the three main implementations (including
groff) don't do that in the quite important -Tutf8 device.

So here is what i propose.

Let's not change anything (neither code nor recommendations) for
typesetting OUTSIDE manual pages, unless there are bugs in devices.

INSIDE manual pages (both -man and -mdoc), let's change - and \- to
always map to U+002D HYPHEN-MINUS for all devices and let's tell
people to simply use - for HYPHEN-MINUS and stop worrying. Those who
care and are aware of such subtleties can use \(hy \(mi \(en in
running text in manuals, but 95% of manual page authors probably
won't, and that's not a problem at all.

This proposal has two downsides, but i consider both very minor
compared to the gain, which is having a consistent way to get U+002D
HYPHEN-MINUS in manual pages, and having a very simple rule that has
a very good chance of actually being followed in practice and that
makes all this easily understandable in the future.

First minor downside for manual pages: Hyphens in running text that
are given as - will be rendered as HYPHEN-MINUS for all devices. But
that's a very minor regression because that's already the case for
the most important devices (ascii, utf8, html). (Note that i'm not
saying that utf8 is more important than ps/pdf in general - only for
manual pages.)

Second minor downside: Hyphen-minus signs in code elements that are
given as - (which we will then encourage!) may render as U+2010
HYPHEN on some legacy systems. But that's an even smaller issue.
Which legacy systems are there in the first place? Which of them
support anything except ascii and latin-1? Who uses them? Will the
users get upset about seeing hyphens in such cases?
I suspect the answers are "very few, almost none, almost nobody, no".
And if they do get upset, it will be easy for them to update their
software to follow groff's lead.

Assuming this is considered the right direction, how would one best
implement, in doc.tmac-u and an-old.tmac, - == \- == U+002D for all
devices?

Yours,
  Ingo
