Matthew Burgess wrote:
> 2) The i18n patch isn't going to be accepted in its current state, which I already suspected. It's incomplete and makes the code harder to maintain. I'm currently waiting on feedback on how to proceed from here.

Either disagree with the maintainers (because the patch is simply necessary for an acceptable level of UTF-8 support, and not only in coreutils), or drop UTF-8 support completely from LFS and BLFS (because it is not ready in "unpatched upstream" and never will be, because no other LFS project has implemented it, and because LFS is intended to be a "minimal base"). UTF-8 is a nightmare to maintain, creates a rather big but unwritten blacklist of packages, and makes experimentation with other areas on the LiveCD difficult for other maintainers (they are afraid to break the thing).

Also, Microsoft's approach to Unicode (keep an 8-bit encoding for legacy applications and support UCS, not a UTF representation of Unicode) is technically superior, and that's what is implemented when one uses Qt-based GUI applications in non-UTF-8 locales. Too bad that it is impossible to implement proper in-kernel NFSv4 support with this approach.

If you drop only the coreutils patch (as opposed to all UTF-8 support), add the following note to the book:

{{{
Many other distributions apply the so-called i18n patch to coreutils. It originates from the OpenI18N group and is currently maintained by RedHat. The patch makes changes necessary for "cut", "pr", "uniq", "expand", "fold", "join", "unexpand" and "sort" to process multibyte characters correctly. Without the patch, the following issues occur:

1) "cut" has no way to take n characters (as opposed to n bytes), and can damage the last character by cutting in the middle of it.
2) "fold" uses number of bytes, not number of character cells, to decide where to fold the string. The result is premature folding or breaking the string in the middle of a multibyte character (a no-no).
3) Utilities that take a separator character as a command-line parameter cannot be told to use a multibyte character as a separator.
4) The OpenI18N testsuite (required for LSB certification) doesn't pass.

However, the patch has been rejected by the upstream maintainers of coreutils, because it's incomplete (e.g., the "tr [:upper:] [:lower:]" command doesn't work correctly in multibyte locales even with the patch) and makes the code harder to maintain. Thus, if you have to process non-ASCII text in UTF-8 locales, you have to do it with other tools, such as Perl.
}}}
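Issues 1) and 2) in the note above can be illustrated with a small sketch. This is Python, not coreutils code: cut_bytes and cut_chars are hypothetical helpers that only mimic byte-oriented and character-oriented cutting, and the sample string is made up.

```python
# Illustration only: byte-oriented vs character-oriented cutting of UTF-8 text.
# cut_bytes/cut_chars are hypothetical helpers, not part of coreutils.

text = "na\u00efve caf\u00e9"   # made-up sample; U+00EF and U+00E9 take 2 bytes each in UTF-8
data = text.encode("utf-8")


def cut_bytes(raw: bytes, n: int) -> bytes:
    """Keep the first n bytes, the way a byte-oriented cut does."""
    return raw[:n]


def cut_chars(raw: bytes, n: int) -> bytes:
    """Keep the first n characters (code points) of UTF-8 input."""
    return raw.decode("utf-8")[:n].encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Report whether raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Three bytes end in the middle of the two-byte character: the output is damaged.
print(cut_bytes(data, 3), is_valid_utf8(cut_bytes(data, 3)))
# Three characters keep the whole character intact.
print(cut_chars(data, 3), is_valid_utf8(cut_chars(data, 3)))
# A byte-counting fold sees a longer line than the character count suggests.
print(len(data), "bytes vs", len(text), "characters")
```

Note that counting characters is still only an approximation for "fold": wide characters occupy two cells, so a correct implementation has to count character cells, as the note says.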

Also note that the patch has existed for 5 years (!!!) and is still not in acceptable shape. It looks like the parties (like RedHat and LSB) that are interested in the results the patch gives are perfectly OK with the deviation.

> 3) The suppress-uptime-kill-su patch is obviously Linux specific, so isn't suitable for upstream.
s/Linux/LFS/

> 4) We currently use a sed to avoid a supposed buffer overflow in translated versions of `who'. This is unnecessary now as it's been fixed in a different manner, so the sed can be removed from the book.
Well, that's partially correct, see the existing code from coreutils-5.96:

 if (include_idle && !short_output && strlen (idle) < sizeof x_idle - 1)
   sprintf (x_idle, " %-6s", idle);
 else
   *x_idle = '\0';

This means that, if the string doesn't fit, it will be dropped from the header completely (thus, you are right that there is no overflow). But that's still not perfect, because the remaining headers then become misaligned with their columns. The sed substitution that disables i18n for the "who" program makes the output better:

sed -i '/config.h/a#undef ENABLE_NLS' src/who.c
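The fallback in the who.c excerpt above can be sketched roughly as follows. This is a Python emulation, not the coreutils code: the 8-byte size for x_idle is an assumption (the excerpt does not show the declaration), and the translated header string is a hypothetical example, not taken from any real message catalog.

```python
# Rough emulation of the x_idle logic quoted from who.c.
X_IDLE_SIZE = 8  # ASSUMPTION: stands in for "sizeof x_idle"; not shown in the excerpt


def idle_header(idle: str, include_idle: bool = True, short_output: bool = False) -> str:
    """Return the idle column header, or "" when it would not fit in the buffer."""
    raw = idle.encode("utf-8")  # strlen() counts bytes, not characters
    if include_idle and not short_output and len(raw) < X_IDLE_SIZE - 1:
        # Emulates sprintf(x_idle, " %-6s", idle): pad to 6 bytes, left-aligned.
        return (b" " + raw.ljust(6)).decode("utf-8")
    return ""  # emulates *x_idle = '\0'


untranslated = "IDLE"
# HYPOTHETICAL translation: 7 Cyrillic letters, 14 bytes in UTF-8.
translated = "\u043f\u0440\u043e\u0441\u0442\u043e\u0439"

print(repr(idle_header(untranslated)))  # fits, so the column header is kept
print(repr(idle_header(translated)))    # too long in bytes, so the header vanishes
```

Under these assumptions there is no overflow, but the translated header silently disappears while the column data is still printed, which is the misalignment described above.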

--
Alexander E. Patrakov