Matthew Burgess wrote:
2) The i18n patch isn't going to be accepted in its current state,
which I already suspected. It's incomplete and makes the code harder
to maintain. I'm currently waiting on feedback on how to proceed from
here.
Either disagree with the maintainers (because it is simply necessary for
an acceptable level of UTF-8 support, not only in coreutils), or drop
UTF-8 support completely from LFS and BLFS (because it is not ready in
"unpatched upstream" and will never be, and because no other LFS project
implemented it, and LFS is intended to be a "minimal base"). UTF-8 is
a nightmare to maintain, creates a rather big but unwritten blacklist of
packages, and it makes experimentation with other areas on the LiveCD
difficult for other maintainers (they are afraid to break the thing).
Also, Microsoft's approach to Unicode (keep an 8-bit encoding for legacy
applications and support UCS, not UTF representation of Unicode) is
technically superior and that's what is implemented when one uses
Qt-based GUI applications in non-UTF-8 locales. Too bad that it is
impossible to implement proper in-kernel NFSv4 support with this approach.
If you drop only the coreutils patch (as opposed to all UTF-8 support),
add the following note to the book:
{{{
Many other distributions apply the so-called i18n patch to coreutils. It
originates from the OpenI18N group and is currently maintained by
RedHat. The patch makes changes necessary for "cut", "pr", "uniq",
"expand", "fold", "join", "unexpand" and "sort" to process multibyte
characters correctly. Without the patch, the following issues occur:
1) "cut" has no way to take n characters (as opposed to n bytes), and
can damage the last character by cutting in the middle of it.
2) "fold" uses number of bytes, not number of character cells, to decide
where to fold the string. The result is premature folding or breaking
the string in the middle of a multibyte character (a no-no).
3) Utilities that take a separator character as a command-line parameter
cannot be told to use a multibyte character as a separator.
4) The OpenI18N testsuite (required for LSB certification) doesn't pass.
However, the patch has been rejected by upstream maintainers of
coreutils, because it's incomplete (e.g., the "tr '[:upper:]' '[:lower:]'"
command doesn't work correctly in multibyte locales even with the patch)
and makes the code harder to maintain. Thus, if you have to process
non-ASCII text in UTF-8 locales, you have to do it with other utilities,
such as Perl.
}}}
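To make issue 1 from the note concrete, here is a minimal sketch of byte-based cutting. It uses only "cut -b", which selects bytes regardless of locale or patch level; the sample word "café" is my own choice for illustration, not something from the coreutils testsuite.

```shell
#!/bin/sh
# "é" is two bytes in UTF-8 (0xC3 0xA9), so "café" is 4 characters but 5 bytes.
word=$(printf 'caf\303\251')

# cut -b selects bytes, so taking the first 4 "columns" keeps only the
# first half of the é and leaves an invalid UTF-8 sequence behind.
cut_hex=$(printf '%s\n' "$word" | cut -b 1-4 | od -An -tx1 | tr -d ' \n')
echo "cut -b 1-4 -> $cut_hex"

# Byte count of the original word, for comparison with its 4 characters.
byte_count=$(printf '%s' "$word" | wc -c | tr -d ' ')
echo "bytes: $byte_count"
```

The hex dump ends in a lone c3 followed by the newline, i.e. the last character has been cut in the middle.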
Also note that the patch has existed for 5 years (!!!) and is still not
in acceptable shape. It looks like the parties (such as RedHat and LSB)
that are interested in the results the patch gives are perfectly OK with
the deviation.
3) The suppress-uptime-kill-su patch is obviously Linux specific, so
isn't suitable for upstream.
s/Linux/LFS/
4) We currently use a sed to avoid a supposed buffer overflow in
translated versions of `who'. This is unnecessary now as it's been
fixed in a different manner, so the sed can be removed from the book.
Well, that's partially correct, see the existing code from coreutils-5.96:
    if (include_idle && !short_output && strlen (idle) < sizeof x_idle - 1)
      sprintf (x_idle, " %-6s", idle);
    else
      *x_idle = '\0';
This means that, if the string doesn't fit, it will be dropped from the
header completely (thus, you are right that there is no overflow). But
that's still not perfect, because the headers then become misaligned
with their columns. The sed substitution that disables i18n for the
"who" program makes the output better:
sed -i '/config.h/a#undef ENABLE_NLS' src/who.c
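For anyone unsure what that expression does, here is a small sketch that runs it against a throwaway file instead of the real src/who.c. The two-line file is a hypothetical stand-in, and the one-line form of the "a" command together with "-i" assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical stand-in for src/who.c; only the include line matters here.
tmp=$(mktemp)
printf '%s\n' '#include <config.h>' '#include <stdio.h>' > "$tmp"

# Same expression as in the book, applied to the stand-in file:
# append "#undef ENABLE_NLS" after every line matching "config.h".
sed -i '/config.h/a#undef ENABLE_NLS' "$tmp"

result=$(cat "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

The #undef lands directly after the config.h include, so, as far as I can tell, the headers included later in that file see ENABLE_NLS as undefined and "who" keeps its untranslated strings.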
--
Alexander E. Patrakov