On Thu, 17 Mar 2011 00:55:32 +0100, Ingo Schwarze
<schwa...@usta.de> wrote:
Anthony J. Bentley wrote on Wed, Mar 16, 2011 at 01:37:50PM -0600:
$ mandoc -Tlint /usr/src/share/man/man4/udl.4
/usr/src/share/man/man4/udl.4:42:6:
  ERROR: skipping bad character: ignoring byte

Thanks for reporting!

Indeed, that character had to be replaced, ...

The o with diaeresis should be replaced with the \(:o escape.
(See mandoc_char(7).)

.. however the German noun "Koenig" does not contain "o diaeresis",
but "o umlaut".  As this is a common source of confusion even
among native Germans, here is my commit message to explain the
situation:

  Using mandoc_char(7) escapes like "K\(:onig" for German umlauts
  is incorrect.  The escape sequence "\(:o" represents "o diaeresis",
  not "o umlaut".  These are two very different phonological phenomena
  that only happen to be represented by the same diacritic mark.

This implies that it was a silly decision to use the same mark, which is
arguable.

  In -Tascii mode, all renderers correctly render "\(:o" (o diaeresis)
  as plain "o", but that rendering is incorrect for "o umlaut", which
  must be transliterated to the digraph "oe" in -Tascii.

That is not due to incorrect conversion, but due to missing language
information.  Your phrase "must be" is only true in a purely
German-language context, which does not apply here.  In Dutch, for
example, it would be more correct when converting German loanwords to
ASCII to drop the diaeresis without adding an 'e', because the
pronunciation rules differ.  Regardless, the König website
http://www.koniggaming.com/ uses both König and Konig, but not Koenig.
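
To make the "missing language information" point concrete, here is a
minimal Python sketch.  The transliterate() helper and its one-language
rule table are hypothetical and purely illustrative; this is not how
mandoc or any other renderer is implemented.

```python
import unicodedata

# Hypothetical per-language digraph rules, for illustration only.
# Only German is listed; any other language falls through to plain
# removal of the diacritic mark.
DIGRAPHS = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"},
}

def transliterate(text, lang=None):
    """Reduce accented letters to ASCII, optionally using language rules."""
    # Compose first so the table lookup sees single precomposed letters.
    text = unicodedata.normalize("NFC", text)
    rules = DIGRAPHS.get(lang, {})
    text = "".join(rules.get(ch, ch) for ch in text)
    # Decompose what is left and drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(transliterate("König", "de"))  # Koenig
print(transliterate("König", "nl"))  # Konig
print(transliterate("König"))        # Konig
```

The same input byte sequence yields different ASCII output depending
only on the language tag, which the character itself does not carry.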

  There is no mandoc_char(7) escape for "o umlaut",

That is no wonder, because Unicode has, since version 1.0, chosen not to
distinguish between diaeresis and umlaut.  See the specification for
U+0308 and the Unicode mailing list archive.  By your explanation,
every single German text on the Internet is encoded "wrong".  But there is
no alternative[1].  Suck it up and use the diaeresis like everybody[1]
else.
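
This is easy to check with Python's standard unicodedata module.  The
describe() helper below is just an illustrative name of mine; the sketch
only shows how Unicode names and decomposes the character.

```python
import unicodedata

def describe(text):
    """List code point and Unicode name for each character after NFD."""
    decomposed = unicodedata.normalize("NFD", text)
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in decomposed]

# The precomposed letter used in German text is named "with diaeresis".
print(unicodedata.name("\u00f6"))  # LATIN SMALL LETTER O WITH DIAERESIS

# Decomposing it yields a plain "o" plus the single combining mark.
print(describe("\u00f6"))
# ['U+006F LATIN SMALL LETTER O', 'U+0308 COMBINING DIAERESIS']
```

There is one combining mark for both uses, so the encoding cannot tell
an umlaut from a diaeresis.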

[1] Yes, the exception is ISO 5426, but that is only used for collating
(sorting); it is unlikely that any widely recognized browser will ever
support it.

--
Made with Opera's revolutionary e-mail program:
http://www.opera.com/mail/
(Remove the obvious prefix to reply.)
