Federico Lucifredi wrote:
(please CC: at least lfs-dev)

Hello Jim, Alex,
Yes, I have plans for (1) and (2), so there should be no particular
problem getting that fixed. I have not thought extensively about the
interaction problems with groff, though, so that is next on the list. For
(3), I would like to know exactly what behavior you would like to see --
an install-time switch, a command-line option at invocation time, or
something else.
3) Feature request: it would be very nice if LFS had a way to tell Man to ignore /usr/share/man/ja/* even in Japanese locales, because the system's Groff-1.19.{2,3cvs} can't format those manuals. This also applies to other languages, and maybe it is better to implement it as a whitelist rather than a blacklist. This whitelist should be different for printing and display purposes.

I want a new option in man.conf. Basically, it's a way for root of a box (which, e.g., has many Russian users but allows one German person to ssh in and override $LANG), to say: "I have created a man setup that works only for Russian, don't attempt to use it in other situations". The idea is to never misformat a manual page unless the "yes --help" output would also be misformatted. Fallback to English is much better than misformatting.

This approach differs from that of Man-DB, which knows good defaults for many locales (but still misformats Chinese manuals in zh_CN.UTF-8).

Let's name the option "TRANSLATIONS" for now. Its value should be a colon-separated list of the following allowable items:

1) language names, such as de, fr,..., that contain only letters.
2) the special tokens "8bit" and "utf8"
3) exact locale names, such as ja_JP. They can be distinguished from (1) by the presence of non-letters.

Similarly, HARDCOPY_TRANSLATIONS can list the same items.
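
To make the three kinds of items concrete, here is a rough Python sketch of how such a value could be split and classified. It is only an illustration (Man itself is written in C), and the names parse_translations and Translations are made up for this example, not part of any existing code.

```python
from dataclasses import dataclass, field

# The two special tokens from item (2) of the list above.
SPECIAL_TOKENS = {"8bit", "utf8"}


@dataclass
class Translations:
    """Parsed form of a TRANSLATIONS or HARDCOPY_TRANSLATIONS value."""
    languages: set = field(default_factory=set)  # letters only, e.g. "de"
    specials: set = field(default_factory=set)   # "8bit" and/or "utf8"
    locales: set = field(default_factory=set)    # anything else, e.g. "ja_JP"


def parse_translations(value):
    """Split a colon-separated value and classify each item."""
    result = Translations()
    for item in value.split(":"):
        item = item.strip()
        if not item:
            continue  # tolerate empty items such as "de::fr"
        if item in SPECIAL_TOKENS:
            result.specials.add(item)
        elif item.isascii() and item.isalpha():
            result.languages.add(item)
        else:
            # Contains a non-letter, so it is treated as an exact locale name.
            result.locales.add(item)
    return result
```

The special tokens are checked first because they contain digits and would otherwise be taken for locale names.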

The distributor or whoever else creates man.conf should set these options to the list of languages/locales where manual pages are known to be formatted properly by *roff and other programs mentioned in man.conf. See below for the exact meaning.

Currently, Man looks at locale environment variables and the LANGUAGE variable to determine the list of search paths for translated manual pages, and picks up the first manual page that exists in the search path. Proposed change: ignore bad manual pages, according to the following rules:

1) if MB_CUR_MAX==1 in the current locale, and the special "8bit" token is present in TRANSLATIONS, any manual page is good. The use case is the current -Tlatin1 setup, which formats all manuals properly in 8-bit locales if they are stored in 8-bit language-specific encodings.
2) if the current locale is a UTF-8 based locale, and the "utf8" token is present in TRANSLATIONS, any manual page is good. The use case is RedHat's groff.
3) if the exact current locale is listed in TRANSLATIONS, and the manual page doesn't come from the LANGUAGE environment variable, it is good. The use case is: format Japanese manuals with -Tnippon on Debian-patched groff.
4) if the manual page's language is listed as a language (not locale!) in TRANSLATIONS, it is good. Use case 1: no -Txxxx switch, rely upon the nroff script itself to figure out the correct device for ISO-8859-1-encoded manuals. Use case 2: the -K switch for the new groff (because the argument correct for Russian manual pages is wrong for German ones).
5) English manual pages are always good.

BTW, I would like to avoid (4.2) and the whole need for the administrator to hardcode the right -K ... argument, because this can be deduced from the manual page location. The old idea of a /usr/share/man/$lang/.charset file may help: it might be a good idea to introduce a "%C" token that would expand to its contents, and a "%K" token that would expand to "-K %C" if "%C" expands to a non-empty string. The default value of "%C" for manual pages specified with a full path is a tough question; I will think more about it. Maybe: empty string.
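
A minimal sketch of the "%C" / "%K" expansion, in Python for brevity. The function names read_charset and expand_tokens are hypothetical, and so is the assumption that the .charset file holds nothing but a charset name on one line; the message does not fix its format.

```python
from pathlib import Path


def read_charset(lang_dir):
    """Return the charset named in <lang_dir>/.charset, or "" if unavailable.

    Assumption: the file contains just the charset name, e.g. "KOI8-R".
    """
    try:
        return Path(lang_dir, ".charset").read_text(encoding="ascii").strip()
    except (OSError, UnicodeDecodeError):
        return ""


def expand_tokens(command, charset):
    """Expand "%K" and "%C" in a formatter command line.

    "%C" becomes the charset; "%K" becomes "-K <charset>", or nothing at all
    when the charset is empty.
    """
    k_option = f"-K {charset}" if charset else ""
    return command.replace("%K", k_option).replace("%C", charset)
```

With a command such as "groff %K -mandoc" (an illustrative value, not a proposed default), the -K option appears only for directories that ship a .charset file, so nothing has to be hardcoded per language.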

Any other manual page is bad and should be treated as a non-existing file.
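
The five rules plus this fallback could be sketched as follows. This is Python pseudo-implementation only, under several assumptions of mine: the caller already knows MB_CUR_MAX and whether the locale is UTF-8 based, the "exact locale" is compared as a plain string, and the page's language is passed in as a bare code such as "ja" ("en" for untranslated pages). LocaleInfo and is_good_page are names invented for the example.

```python
from dataclasses import dataclass

SPECIAL_TOKENS = {"8bit", "utf8"}


@dataclass
class LocaleInfo:
    name: str        # exact locale name as it would be written in TRANSLATIONS
    mb_cur_max: int  # value of MB_CUR_MAX in that locale
    is_utf8: bool    # True for UTF-8 based locales


def is_good_page(page_lang, from_language_var, loc, translations):
    """Decide whether a candidate manual page may be used.

    page_lang          language of the page, "en" for English
    from_language_var  True if the page was found via the LANGUAGE variable
    loc                LocaleInfo describing the current locale
    translations       raw TRANSLATIONS (or HARDCOPY_TRANSLATIONS) value
    """
    items = {item for item in translations.split(":") if item}
    languages = {item for item in items if item.isalpha()}
    locales = items - languages - SPECIAL_TOKENS

    if page_lang == "en":                              # rule 5
        return True
    if loc.mb_cur_max == 1 and "8bit" in items:        # rule 1
        return True
    if loc.is_utf8 and "utf8" in items:                # rule 2
        return True
    if loc.name in locales and not from_language_var:  # rule 3
        return True
    if page_lang in languages:                         # rule 4
        return True
    return False  # any other page is treated as non-existent
```

A rejected page is simply skipped, so the search continues along the path and ends at the English page in the worst case.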

A good default would be (assuming the Debian-specific manual page encoding, the -Tlatin1 default nroff argument, and the "less -isR" pager that is able to convert from ISO-8859-1 to UTF-8 on the fly):

TRANSLATIONS 8bit:da:de:en:es:fi:fr:ga:gl:id:is:it:nb:nl:nn:no:pt:sv
HARDCOPY_TRANSLATIONS da:de:en:es:fi:fr:ga:gl:id:is:it:nb:nl:nn:no:pt:sv

--
Alexander E. Patrakov
--
http://linuxfromscratch.org/mailman/listinfo/lfs-dev
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
