Andries,

Currently on a Linux system you find man pages in the following encodings:
  - ISO-8859-1 (German, Spanish, French, Italian, Brazilian, ...),
  - ISO-8859-2 (Hungarian, Polish, ...),
  - KOI8-R (Russian),
  - EUC-JP (Japanese),
  - UTF-8 (Vietnamese),
  - ISO-8859-7, ISO-8859-9, ISO-8859-15, ISO-8859-16 (man7/*),
and none of them contains an encoding marker.

The goal is that "groff -T... -mandoc" works on any man page, without
the need to specify the encoding as an argument to groff.

There are two options:
  a) Recognize only UTF-8 encoded man pages. This is the simplest.
     groff will be changed to emit errors when it is fed non-UTF-8
     input, so that the man page maintainers are notified that they need to
     convert their man page to UTF-8.
  b) Recognize the encoding according to a note in the first line
        '\" -*- coding: EUC-JP -*-
     groff will then emit errors when it is fed input that is non-ASCII and
     has no coding: marker, so that man page maintainers are notified that
     they need to add the coding: marker.
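
To make the difference between the two options concrete, here is a rough
sketch in Python of the check each one implies. It is only an illustration,
not groff code: the function names (is_utf8, declared_coding,
resolve_encoding) and the exact pattern used to find the marker are my own
assumptions, and the sketch assumes the marker sits in a comment on the
first line, as in the example under b).

```python
import re

# Assumed marker syntax: "-*- coding: NAME -*-", as in the example under b).
CODING_RE = re.compile(rb"-\*-.*?coding:\s*([A-Za-z0-9._-]+).*?-\*-")


def is_utf8(data: bytes) -> bool:
    """Option a): accept the page only if it decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def declared_coding(data: bytes):
    """Option b): return the encoding named in a first-line comment, or None."""
    first_line = data.split(b"\n", 1)[0]
    # A roff comment line starts with '\" or .\"
    if not first_line.startswith((b"'\\\"", b".\\\"")):
        return None
    match = CODING_RE.search(first_line)
    return match.group(1).decode("ascii") if match else None


def resolve_encoding(data: bytes) -> str:
    """Option b) policy: marker wins, pure ASCII is fine, anything else is an error."""
    coding = declared_coding(data)
    if coding is not None:
        return coding
    if data.isascii():
        return "US-ASCII"
    raise ValueError("non-ASCII input without a coding: marker")
```

With a) the only question is whether is_utf8() holds for the whole file.
With b) a page without a marker is still accepted as long as it is pure
ASCII, and every other page has to carry the marker.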

Which of the two would you, as Linux man pages maintainer, prefer?

Bruno


--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
