Hello Rogier,

Rogier Wolff wrote:
> Guys, I've been using Unix systems for over twenty years. I see that
> other people manage to get their Unix systems to "talk" to them in
> Dutch: "Bestand is niet gevonden". Besides that this sentence is wrong
> in a lot of contexts, I'm used to "file not found".

As a side note to this discussion, if you find translation errors or
improvements and can contribute fixes, that would be great. Please send
those to the translation team. The address is usually found in the .po
file which contains the translation. They are the ones who will know
best what to do to correct it.

> When I install a modern system, (possibly through debootstrap, a
> chrooted or nfs-mounted root setup), perl complains loudly about some
> LC_ variable not being set. The way I've found to get it to shut up,
> and get a sane, working apt-setup is to install "locales" whatever
> that may mean. I then have to select something that starts with "en" to
> get the system to speak english to me.

Perl complains to you because you apparently have LANG set without the
corresponding locale installed. It doesn't complain if LANG is not set,
so you must have LANG set to some locale. This was apparently set for
you without your knowledge, and the rest of the problem follows from
that. I see the same thing here.

You can prevent perl from complaining by unsetting LANG and any other
LC_* environment variable that is set. With neither LANG nor any LC_*
variable set, the default locale is the traditional C locale. This is
also standardized by POSIX and is also known as the POSIX locale. The
strings "C" and "POSIX" are equivalent, but we traditionalists typically
use "C" to emphasize that it is the traditional behavior that we are
setting. Setting LANG=C is the same as not setting it at all.

HOWEVER! The LC_ALL variable overrides all other variables. If you
have LC_ALL set then it doesn't matter what you have the other
variables set to. LC_ALL is the highest priority override. Similarly,
LC_COLLATE and the other LC_* variables override LANG unless LC_ALL is
set. Therefore we *must* talk about the LC_* variables when talking
about LANG. But I hate doing so since it makes the conversation so
messy.

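As a rough sketch of the precedence just described, the following shell
function reports which value ends up governing collation. The function
name effective_collate is made up for illustration and is not part of
any standard tool. It only mirrors the lookup order LC_ALL, then
LC_COLLATE, then LANG, falling back to C, and it treats an empty
variable the same as an unset one.

```shell
# Hypothetical helper (not a standard command): print the locale name
# that governs collation, checking LC_ALL, then LC_COLLATE, then LANG,
# and falling back to "C". Empty values are treated as unset.
effective_collate() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_COLLATE:-}" ]; then
        printf '%s\n' "$LC_COLLATE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' "C"
    fi
}
```

Run in a subshell so your real settings are left alone, for example
( unset LC_ALL; LANG=en_US.UTF-8; LC_COLLATE=POSIX; effective_collate )
prints POSIX, because LC_COLLATE beats LANG when LC_ALL is not set.
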
It is much easier just to say that setting LC_ALL=C is the biggest
available override. This is also often seen in scripts to force
standard behavior.

> Apparently one of these steps has the side effect of changing the sort
> order.

You don't like it and I don't like it, but the-powers-that-be have
confused working with data on a computer with talking about working
with data on a computer. They have decided that the collation ordering
(sort ordering) for data should be dictionary ordering. In dictionary
ordering, case is folded together and punctuation is ignored. By having
LANG set to any of the "en" locales the system is instructed to use
dictionary sort ordering. This affects almost everything on the system
that sorts.

> I just want the system to talk english to me, and simply sort
> my directories in the "normal" order. I don't even know where the LANG
> variable is set. I don't want to have to find out. It is not mentioned
> in your FAQ.

The FAQ entry was originally written when this problem first hit
people, before anyone understood the details of the problem, and it
needs to be rewritten.

Since you mention APT, what you want to do is to reconfigure the
locales on your system with 'dpkg-reconfigure locales'. When it asks
you for the "Default locale for the system environment:" select "None".
This will remove the setting of LANG from /etc/environment and remember
that it shouldn't be set. I am sure that by default it is placing a
locale with dictionary sort order there. This is a distribution-specific
configuration and every operating system does it differently.

However, setting the standard locale (the C/POSIX/none locale is the
standard locale, all others are non-standard) will have other effects.
The setting is usually used to control whether graphics terminals
support Unicode/UTF-8 characters and other i18n behavior. Turning it
off will probably prevent you from using non-ASCII characters. That is
often not acceptable.

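To make the ordering difference described above concrete, here is a
small sketch. The sample words and the helper name sample are made up.
Under LC_ALL=C, sort compares bytes, so in ASCII every uppercase letter
sorts before "_", which sorts before every lowercase letter. The second
command assumes the en_US.UTF-8 locale is installed. Its result depends
on the system's locale data, so I am not claiming any output for it.

```shell
# Made-up sample data: mixed case plus a leading punctuation character.
sample() {
    printf '%s\n' banana Apple _config apple Banana
}

# Byte-order (C/POSIX) collation, fully determined by ASCII code points.
# Prints: Apple, Banana, _config, apple, banana (one per line).
sample | LC_ALL=C sort

# Dictionary-style collation. Assumes en_US.UTF-8 is installed; the
# ordering comes from the locale data, so compare it yourself.
sample | LC_ALL=en_US.UTF-8 sort
```
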
What I do to compromise is to set LANG=en_US.UTF-8 but also set
LC_COLLATE=C to force a standard sort order regardless. I put this in
my $HOME/.bashrc file.

  export LANG=en_US.UTF-8
  export LC_COLLATE=C

> Your explanation is fine. But it should be in the FAQ. The FAQ tells
> me that if it sorts weird, I have an LC_... variable set.

Well, in actuality it says:

  "You or your vendor have probably set environment variables like
  LANG, LC_ALL, or LANG to en_US."

It doesn't say that you have LC_ALL set. It says that you have one of
the many variables listed set. I believe that is true.

Setting 'export LC_ALL=C' will definitely force a standard sort
ordering. The shell reads this at start time only, so you would need to
start a new shell to have it take effect for shell sort operations such
as "*" file globbing.

> I didn't have an LC_... variable set, and still it sorted wrong.

Then you *must* have had LANG or LC_COLLATE set. You can print your
locale settings with the 'locale' command.

  $ locale

> I'm smart enough to finally figure out it was a LANG variable, and
> you're intimate enough with the workings of all this to explain the
> order in which the different setting variables are tried. However,
> as it stands I had no chance to find accurate information in the
> FAQ.

You are right that the FAQ entry needs an update. If you have
suggestions for improvements there, that would be great. I will queue
up some time to work on improving it.

Bob

_______________________________________________
Bug-coreutils mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-coreutils
