Greg Schafer wrote:

> Correct. However, the problem is with Readline and not Bash.

You are of course right. Thanks for the correction. I noticed this in Chapter 5 as a difference between Glibc-based LFS and uClibc-based HLFS. That's why I thought it was Bash.

> Now grep the Readline source for NON_NEGATIVE and you can see how it makes
> a difference.

It seems that the _rl_lowercase_p(c) macro uses this, and that macro in turn is used in vi_mode.c.
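
To make the NON_NEGATIVE point concrete, here is a rough Python model of the C logic. It is a sketch under assumptions, not a copy of the Readline source: it assumes the macro is roughly `((unsigned char)(c) == (c))` unless a configure-time define short-circuits it to 1, and it uses Python's Latin-1 `str.islower()` as a stand-in for a locale-aware C `islower()`. The helper names and the `ctype_non_ascii` flag are hypothetical.

```python
def to_signed_char(byte: int) -> int:
    """Interpret a byte value 0..255 as a C signed char (two's complement)."""
    return byte - 256 if byte > 127 else byte


def non_negative(c: int, ctype_non_ascii: bool) -> bool:
    """Model of the assumed NON_NEGATIVE(c) macro."""
    if ctype_non_ascii:
        # Assumed behaviour when the configure check succeeded.
        return True
    # Models (unsigned char)(c) == (c): false for negative char values.
    return (c & 0xFF) == c


def rl_lowercase_p(c: int, ctype_non_ascii: bool) -> bool:
    """Model of _rl_lowercase_p(c): NON_NEGATIVE(c) && islower(c)."""
    if not non_negative(c, ctype_non_ascii):
        return False
    # Stand-in for islower() in an ISO-8859-1 locale.
    return bytes([c & 0xFF]).decode("latin-1").islower()


# 0xE4 is a lowercase letter in ISO-8859-1, but negative as a signed char.
high_char = to_signed_char(0xE4)
pessimistic = rl_lowercase_p(high_char, ctype_non_ascii=False)
optimistic = rl_lowercase_p(high_char, ctype_non_ascii=True)
```

Under these assumptions the pessimistic default rejects every character with the high bit set, so only the build where the check actually ran treats such letters as lowercase.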

> Some other comments:
>
> - The configure check in question is skipped when cross compiling

Correct, and a pessimistic assumption is made then.

> - The Bash testsuite will gain greater coverage if the locale en_US.UTF-8
>   is installed

Already covered in LFS, because of patched coreutils. Thanks anyway for the report.

> - If you're wise you will not rely on anything I (or DIY) say about
>   internationalization. Me being a native English speaker I have no
>   personal need for it and because I'm a selfish b*stard I'm not likely
>   to become an expert anytime soon :-)

Point taken.

> - I'd appreciate it if you could report DIY issues to the DIY list or to
>   me personally. Thanks.

Do I understand correctly that bugs which affect both books have to be reported to both lists?

> PS - Slightly OT but in case you haven't seen it, there is a reasonable
> (but possibly inaccurate) writeup of UTF-8 issues here:
>
> http://www.linux.com/article.pl?sid=06/01/26/210214

Indeed, incomplete and inaccurate. But still partially useful.

Some of the inaccuracies below are already mentioned in the article's comments.

1) The article assumes a distro where the distro-maker has applied all UTF-8 patches.
2) Option "XkbLayout" "en_us.utf-8" is just wrong; UTF-8 users and everyone else use the same X keyboard layout.
3) Missing link to DejaVu fonts.
4) export GTK_IM_MODULE=xim works, but is suboptimal if the input method program supports GTK2 natively. In such a case (e.g. with SCIM), export GTK_IM_MODULE=scim yields better results, e.g. it avoids locking the whole system if one program misbehaves. Setting up XIM itself (for non-GTK apps) is not described at all.
5) "For touch typists used to a legacy English locale, probably the greatest annoyance is the need to press the apostrophe key followed by the space bar to type a straight quotation mark." Wrong, this applies only to a wrongly chosen XkbLayout which contains dead keys. The plain "us" layout (which doesn't contain dead keys) works as before.
6) "enter extended HTML characters in decimal format." Wrong, you don't need it if you serve UTF-8 HTML pages with the correct content-type "text/html; charset=UTF-8".
7) "you may need to configure Mozilla to use xprint." Was true not too long ago. Wrong for Seamonkey-1.0, which features Freetype-based printing and Pango-based layout.
8) "BASH can display UTF-8 characters, but cannot accept them as input. This limitation may be due to legacy system fonts, or to limitations in UTF-8 support in the kernel." This is a kernel issue, which is solved in the LFS book, BTW. I was asked to submit the kernel patch for 2.6.17.
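
As an illustration of the content-type remark above (decimal character references versus simply serving UTF-8), here is a small Python sketch. The function names and the sample text are made up for the example; it only builds the header and body and does not start a server.

```python
def as_numeric_entities(text: str) -> str:
    """ASCII-only HTML: every non-ASCII character becomes a decimal reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def as_utf8_response(text: str):
    """UTF-8 HTML: the body is sent as-is, the header declares the charset."""
    body = text.encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=UTF-8",
        "Content-Length": str(len(body)),
    }
    return headers, body


page = "<p>Привет</p>"  # sample text, chosen only for illustration
entities = as_numeric_entities(page)
headers, body = as_utf8_response(page)
```

With the charset declared in the header, the page source stays readable and no decimal references are needed; the entity form is only a fallback for pages that must stay pure ASCII.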

> PPS - Some folks have asked me about UTF-8. In short, I have no clue. I
> admire the distros for forging ahead. I also admire the work Alex has done
> to try and get it working for the rest of us because it's clearly the way
> of the future. But for me right now, I feel it's simply too disruptive and
> 100K+ patches are not on my agenda. I'd rather wait for upstream packages
> to catch up before trying to support UTF-8. Yes, it might take a while..

You could just as well implement (but see below) a minimal acceptable subset of what is in the LFS book:

1) ncursesw
2) grep "bracket" patch from RedHat
3) Pass "+lang none" to Man, say that it's the reader's responsibility to remove all non-ISO-8859-1 manual pages, don't install them yourself with Shadow. Point readers who want non-ISO-8859-1 manual pages to my "man-i18n" hint.
4) sysklogd 8-bit patch
5) Mention that the contents of /etc/inittab are closely tied to the bootscripts
6) Pass --enable-multibyte to vim
7) Recommend that the X Window System should be used instead of text console with UTF-8 locales due to a kernel bug related to keyboard input.

As you see, 100K-patches are outside of the proposed steps :) But the fact is that (due to LSB) patched coreutils, grep and diffutils are tested better than non-patched ones.

Rationale for not including other UTF-8 stuff that is in LFS:

1) 100K-patches: they are not a complete solution and don't stop me from abusing perl.
2) Glibc-libidn: it is optional because there are not many IDN sites.
3) Removal of non-working Glibc locales: one may as well just not use them.
4) Debian Groff-1.18.1.1 + DB + Man-DB: Jim Gifford says it's overkill. Groff-1.19.2 + Man works fine for formatting ISO-8859-1 manuals. Hacks exist to make it show Russian. Japanese does require the Man setup in the LFS book. So does the LiveCD configuration where the need to reconfigure anything except $LANG is automatically a bug.
5) Kbd backspace fix: nice but optional because the bs-sends-del keymap snippet from LFS-6.1 works.
6) Texinfo patch: it fixes a purely cosmetic issue.
7) Bootscripts and the kernel: they are already outside of DIY.

But it is indeed better to say "no" to UTF-8 support now, because of BLFS issues and because UTF-8 support is a property of the whole system, not just its base. The blacklist approach ("If a package is not listed here, it means there are no known locale specific issues or problems with that package" on http://www.linuxfromscratch.org/blfs/view/svn/introduction/locale-issues.html) effectively defeats the purpose of the page because the blacklist is always incomplete.

Yes, this does mean that UTF-8 support in LFS is next to useless now.
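
The blacklist problem mentioned above can be sketched in a few lines of Python. Everything here is hypothetical: the package names are placeholders and not real BLFS entries, and the three-state lookup is only one way to make "nobody has tested this" visible; it is not something either book does.

```python
# Hypothetical data for illustration only.
KNOWN_ISSUES = {"pkg-a": "mangles multibyte input"}
TESTED_OK = {"pkg-b"}


def blacklist_status(pkg: str) -> str:
    """Blacklist reading: anything not listed is presumed fine."""
    return KNOWN_ISSUES.get(pkg, "no known issues")


def explicit_status(pkg: str) -> str:
    """Three-state reading: untested packages are reported as such."""
    if pkg in KNOWN_ISSUES:
        return KNOWN_ISSUES[pkg]
    if pkg in TESTED_OK:
        return "tested, no issues found"
    return "untested"


# "pkg-c" was never tested by anyone in a UTF-8 locale.
blacklist_answer = blacklist_status("pkg-c")
explicit_answer = explicit_status("pkg-c")
```

The blacklist lookup gives the same answer for a package that was tested and found fine as for one that nobody has looked at, which is exactly why an incomplete list tells the reader so little.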

--
Alexander E. Patrakov
--
http://linuxfromscratch.org/mailman/listinfo/lfs-dev
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
