Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-07 Thread Kerin Millar

On 06/08/2013 23:42, Stroller wrote:


On 6 August 2013, at 14:04, Kerin Millar wrote:

...
If undefined, the value of LC_COLLATE is inherited from LANG. I'm not sure that 
overriding it is particularly useful nowadays but it doesn't hurt.


It's been a couple of years since I looked into this, but I'm given to believe 
that LANG should set all LC_ variables correctly, and that overriding them is 
frowned upon.


As has been mentioned, there are valid reasons to want to override the 
collation. Here is a concrete example:


https://lists.gnu.org/archive/html/bug-gnu-utils/2003-08/msg00537.html

Strictly speaking, grep is correct to behave that way but it can be 
confounding. In an ideal world, everyone would be using named classes 
instead of ranges in their regular expressions but it's not an ideal world.


These days, grep no longer exhibits this characteristic in Gentoo. 
Nevertheless, it serves as a valid example of how collations for UTF-8 
locales can be a liability.
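
To make the range-versus-class point concrete, here is a minimal sketch using only standard grep. The sample words and function names are made up for illustration. How a range behaves under a language collation varies between grep and glibc versions, so only the C-locale result is stated here.

```shell
# Hypothetical sample input: one lower-case word, one capitalised word.
sample() { printf '%s\n' apple Banana; }

# A range: which characters lie "between" A and Z may depend on
# LC_COLLATE in some grep/glibc combinations, so it is pinned to C here.
match_range() { sample | LC_ALL=C grep '^[A-Z]'; }

# A named class: asks for upper-case letters directly. Class membership
# comes from LC_CTYPE rather than from the collation order.
match_class() { sample | LC_ALL=C grep '^[[:upper:]]'; }

match_range   # prints: Banana
match_class   # prints: Banana
```

The named class says what is meant, so it does not rely on any particular sort order of the alphabet.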


Of the other distros, Arch Linux also defined LC_COLLATE=C although I 
understand that they have just recently stopped doing that.


On a production system, I would still be inclined to use it for reasons 
of safety. For that matter, some people refuse to use UTF-8 at all on 
the grounds of security; the handling of variable-width encodings 
continues to be an effective bug inducer.
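
As a small illustration of what "variable-width" means in practice, the sketch below counts the bytes in a single accented character. The helper names are hypothetical; the byte values are the standard UTF-8 encoding of U+00E9.

```shell
# "é" (U+00E9) in UTF-8 is the two bytes 0xC3 0xA9, given here in octal.
e_acute() { printf '\303\251'; }

# Count bytes; tr strips the padding that some wc implementations add.
byte_count() { e_acute | wc -c | tr -d ' '; }

byte_count   # prints 2: one character, two bytes
```

Code that assumes one byte per character (fixed-size buffers, byte offsets used as column positions) is where this kind of mismatch tends to bite.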



I had to do this myself because, due to a bug, the en_GB time formatting failed 
to display am or pm. I believe this should be fixed now.


Presumably:

a) LANG was defined inappropriately
b) LANG was defined appropriately but LC_TIME was defined otherwise
c) LC_ALL was defined, trumping all

I would definitely not advise doing any of these things.
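
The three cases above follow from the order in which the variables are consulted. A rough sketch of that order as POSIX describes it, with a hypothetical helper that takes the three values as arguments instead of reading the environment:

```shell
# resolve_category LC_ALL_VALUE CATEGORY_VALUE LANG_VALUE
# Prints the locale a single category (LC_TIME, say) ends up with.
resolve_category() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # LC_ALL trumps everything
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # then the category's own variable
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"      # then LANG
    else
        printf '%s\n' C         # assumption: default to the C/POSIX locale
    fi
}

# Illustrative values only:
resolve_category "" "" "en_US.UTF-8"                        # (a) prints en_US.UTF-8
resolve_category "" "en_US.UTF-8" "en_GB.UTF-8"             # (b) prints en_US.UTF-8
resolve_category "en_US.UTF-8" "en_GB.UTF-8" "en_GB.UTF-8"  # (c) prints en_US.UTF-8
```

To check a live session, something like `resolve_category "${LC_ALL-}" "${LC_TIME-}" "${LANG-}"` should agree with the LC_TIME line printed by `locale`.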

--Kerin



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-07 Thread Stroller

On 7 August 2013, at 13:41, Kerin Millar wrote:

 On 06/08/2013 23:42, Stroller wrote:
 
 On 6 August 2013, at 14:04, Kerin Millar wrote:
 ...
 If undefined, the value of LC_COLLATE is inherited from LANG. I'm not sure 
 that overriding it is particularly useful nowadays but it doesn't hurt.
 
 It's been a couple of years since I looked into this, but I'm given to 
 believe that LANG should set all LC_ variables correctly, and that 
 overriding them is frowned upon.
 
 As has been mentioned, there are valid reasons to want to override the 
 collation. Here is a concrete example:
 
 https://lists.gnu.org/archive/html/bug-gnu-utils/2003-08/msg00537.html
 
 Strictly speaking, grep is correct to behave that way but it can be 
 confounding.

Linking also this answer, which you're aware of:
https://lists.gnu.org/archive/html/bug-gnu-utils/2003-08/msg00600.html

This only goes to illustrate that you shouldn't go overriding these 
willy-nilly without full awareness of why you're doing so and what you're doing.


 I had to do this myself because, due to a bug, the en_GB time formatting 
 failed to display am or pm. I believe this should be fixed now.
 
 Presumably:
 
 a) LANG was defined inappropriately
 b) LANG was defined appropriately but LC_TIME was defined otherwise
 c) LC_ALL was defined, trumping all


I'm having trouble parsing this reply, but perhaps you might find the full bug 
description helpful. I wrote about 1000 words on the subject there last year.

It is the top Google hit for en_gb am pm bug: 
http://sourceware.org/bugzilla/show_bug.cgi?id=3768

Stroller.




Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-07 Thread Kerin Millar

On 07/08/2013 17:40, Stroller wrote:


On 7 August 2013, at 13:41, Kerin Millar wrote:


On 06/08/2013 23:42, Stroller wrote:


On 6 August 2013, at 14:04, Kerin Millar wrote:

...
If undefined, the value of LC_COLLATE is inherited from LANG. I'm not sure that 
overriding it is particularly useful nowadays but it doesn't hurt.


It's been a couple of years since I looked into this, but I'm given to believe 
that LANG should set all LC_ variables correctly, and that overriding them is 
frowned upon.


As has been mentioned, there are valid reasons to want to override the 
collation. Here is a concrete example:

https://lists.gnu.org/archive/html/bug-gnu-utils/2003-08/msg00537.html

Strictly speaking, grep is correct to behave that way but it can be confounding.


Linking also this answer, which you're aware of:
https://lists.gnu.org/archive/html/bug-gnu-utils/2003-08/msg00600.html


Best practice will never be universally observed.



This only goes to illustrate that you shouldn't go overriding these 
willy-nilly without full awareness of why you're doing so and what you're doing.


It also served to illustrate the overall point I was making - that 
sticking to the C/POSIX collation is not without value as a safety 
measure. Naturally, I would expect anyone else to exercise their own 
judgement.






I had to do this myself because, due to a bug, the en_GB time formatting failed 
to display am or pm. I believe this should be fixed now.


Presumably:

a) LANG was defined inappropriately
b) LANG was defined appropriately but LC_TIME was defined otherwise
c) LC_ALL was defined, trumping all



I'm having trouble parsing this reply, but perhaps you might find the full bug 
description helpful. I wrote about 1000 words on the subject there last year.

It is the top Google hit for en_gb am pm bug: 
http://sourceware.org/bugzilla/show_bug.cgi?id=3768


OK.

--Kerin



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Kerin Millar

On 05/08/2013 23:52, Chris Stankevitz wrote:

On Mon, Aug 5, 2013 at 11:53 AM, Mike Gilbert flop...@gentoo.org wrote:

The handbook documents setting a system-wide default locale. You
generally do this by setting the LANG variable in
/etc/conf.d/02locale.

http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?part=1&chap=8#doc_chap3_sect3


Mike,

Thank you for your help.  I attempted to follow these instructions and
ran into three problems.  Can you please confirm the fixes I employed
to deal with each of these issues:

1. The handbook suggests I should modify the file /etc/env.d/02locale,
but that file does not exist on my system.  RESOLUTION: create the
file


Run eselect locale, first with the list parameter and then the set 
parameter as appropriate. It's easier.




2. The handbook suggests I should add this line to
/etc/env.d/02locale: 'LANG=de_DE.UTF-8', but I do not speak the
language DE.  RESOLUTION: type instead 'LANG=en_US.UTF-8' to match
/etc/locale.gen


Legitimate locales are those installed with glibc. These can be shown 
with either eselect locale list or locale -a.




3. The handbook suggests that I should add this line to
/etc/env.d/02locale: 'LC_COLLATE=C', but I do not know if they are
again talking about the language DE.  RESOLUTION: I assumed
LC_COLLATE=C refers to english and added the line without
modification.


C refers to the POSIX locale [1].

Defining LC_COLLATE is a workaround for behaviour deemed surprising to 
those otherwise unaware of the impact of collations. For example, files 
beginning with a dot might no longer appear at the top of a directory 
listing and ranges in regular expressions may be affected, depending on 
the extent to which a given program abides by the locale. Poorly written 
shell scripts that capture from ls (assuming a given order) might also 
be affected.


If undefined, the value of LC_COLLATE is inherited from LANG. I'm not 
sure that overriding it is particularly useful nowadays but it doesn't hurt.
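
For scripts that depend on a particular order, one alternative to a system-wide LC_COLLATE is to pin the locale for just the command that sorts. A minimal sketch, with made-up file names:

```shell
# Hypothetical names, including one that begins with a dot.
names() { printf '%s\n' apple .hidden Zebra; }

# LC_ALL=C overrides every category, but only for this one sort
# invocation; the calling shell's environment is left untouched.
sorted_bytewise() { names | LC_ALL=C sort; }

sorted_bytewise
# prints:
# .hidden
# Zebra
# apple
```

With a language collation the same input may come out in a different order (dots and case are typically weighted differently), which is the surprise the handbook setting is meant to avoid.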


--Kerin

[1] 
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html#tag_07_02




Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Bruce Hill
On Tue, Aug 06, 2013 at 02:04:00PM +0100, Kerin Millar wrote:
 
 Legitimate locales are those installed with glibc. These can be shown 
 with either eselect locale list or locale -a.

I had never used eselect with locales (AFAIR) before today.

Why does locale -a return utf8? I know UTF-8 is accepted as standard, utf8
is not but usually recognized, but want to understand why locale -a output
omits the standard, which is set on my systems, and differs from the others:

mingdao@workstation ~ $ eselect locale list
Available targets for the LANG variable:
  [1]   C
  [2]   POSIX
  [3]   en_US.utf8
  [4]   en_US.UTF-8 *
  [ ]   (free form)
mingdao@workstation ~ $ locale -a
C
POSIX
en_US.utf8
mingdao@workstation ~ $ locale
LANG=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE=C
LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

Cheers,
Bruce
-- 
Happy Penguin Computers   ')
126 Fenco Drive   ( \
Tupelo, MS 38801   ^^
supp...@happypenguincomputers.com
662-269-2706 662-205-6424
http://happypenguincomputers.com/

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

Don't top-post: http://en.wikipedia.org/wiki/Top_post#Top-posting



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Kerin Millar

On 06/08/2013 14:24, Bruce Hill wrote:

On Tue, Aug 06, 2013 at 02:04:00PM +0100, Kerin Millar wrote:


Legitimate locales are those installed with glibc. These can be shown
with either eselect locale list or locale -a.


I had never used eselect with locales (AFAIR) before today.

Why does locale -a return utf8? I know UTF-8 is accepted as standard, utf8
is not but usually recognized, but want to understand why locale -a output
omits the standard, which is set on my systems, and differs from the others:

mingdao@workstation ~ $ eselect locale list
Available targets for the LANG variable:
   [1]   C
   [2]   POSIX
   [3]   en_US.utf8
   [4]   en_US.UTF-8 *
   [ ]   (free form)
mingdao@workstation ~ $ locale -a
C
POSIX
en_US.utf8
mingdao@workstation ~ $ locale
LANG=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE=C
LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=


Apparently, utf8 is the canonical representation in glibc (which 
provides the locale tool):


http://lists.debian.org/debian-glibc/2004/12/msg00028.html

That eselect enumerates the locale twice when the alternate form is 
specified in /etc/env.d/02locale could be considered as a minor bug.
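
As far as I understand it, both spellings reach the same locale because the codeset part of the name is normalised (lower-cased, punctuation dropped) before lookup. The function below is a hypothetical approximation of that idea for comparing names in a script. It is not glibc's actual code, and it also lower-cases any @modifier that follows the codeset.

```shell
# normalize_locale_name NAME
# "en_US.UTF-8" -> "en_US.utf8"; names without a codeset pass through.
normalize_locale_name() (
    name=$1
    case $name in
        *.*) ;;
        *) printf '%s\n' "$name"; exit 0 ;;
    esac
    base=${name%%.*}        # language_TERRITORY part
    codeset=${name#*.}      # everything after the first dot
    codeset=$(printf '%s' "$codeset" | tr '[:upper:]' '[:lower:]' | tr -d '-')
    printf '%s.%s\n' "$base" "$codeset"
)

normalize_locale_name en_US.UTF-8   # prints: en_US.utf8
normalize_locale_name en_US.utf8    # prints: en_US.utf8
```

Comparing normalised names is one way a tool could avoid listing the same locale twice.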


--Kerin



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Bruce Hill
On Tue, Aug 06, 2013 at 02:40:04PM +0100, Kerin Millar wrote:
 
 Apparently, utf8 is the canonical representation in glibc (which 
 provides the locale tool):
 
 http://lists.debian.org/debian-glibc/2004/12/msg00028.html
 
 That eselect enumerates the locale twice when the alternate form is 
 specified in /etc/env.d/02locale could be considered as a minor bug.
 
 --Kerin

RFC 3629 does not mention utf8, but I did see this notation in Wikipedia, and
yes, I understand that's not official:

Other descriptions that omit the hyphen or replace it with a space, such as
utf8 or UTF 8, are not accepted as correct by the governing standards.[14]
Despite this, most agents such as browsers can understand them, and so
standards intended to describe existing practice (such as HTML5) may
effectively require their recognition.

[14] http://www.ietf.org/rfc/rfc3629.txt

I was only mildly curious seeing utf8 show up, because on numerous occasions
in #gentoo on FreeNode there have been different reports of incorrect
characters displayed with utf8, then fixed with UTF-8. Having read RFC 3629, I
just made it a habit to always use the standard (UTF-8).

Having read the remainder of the Debian ML thread you referenced, I have a
headache. Debian did that to me when I used it for ~3 months in 2003.  :-)

Cheers,
Bruce



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Kerin Millar

On 06/08/2013 15:26, Bruce Hill wrote:

On Tue, Aug 06, 2013 at 02:40:04PM +0100, Kerin Millar wrote:


Apparently, utf8 is the canonical representation in glibc (which
provides the locale tool):

http://lists.debian.org/debian-glibc/2004/12/msg00028.html

That eselect enumerates the locale twice when the alternate form is
specified in /etc/env.d/02locale could be considered as a minor bug.

--Kerin


RFC 3629 does not mention utf8, but I did see this notation in Wikipedia, and
yes, I understand that's not official:

Other descriptions that omit the hyphen or replace it with a space, such as
utf8 or UTF 8, are not accepted as correct by the governing standards.[14]
Despite this, most agents such as browsers can understand them, and so
standards intended to describe existing practice (such as HTML5) may
effectively require their recognition.

[14] http://www.ietf.org/rfc/rfc3629.txt


Internally, glibc may use whatever representation it pleases.


I was only mildly curious seeing utf8 show up, because on numerous occasions
in #gentoo on FreeNode there have been different reports of incorrect
characters displayed with utf8, then fixed with UTF-8. Having read RFC 3629, I
just made it a habit to always use the standard (UTF-8).


Probably due to buggy applications. According to a glibc maintainer, 
they should be using the nl_langinfo() function but some try to read the 
locale name itself. The response of both of these commands is the same:


# LC_ALL=en_US.UTF-8 locale -k LC_CTYPE | grep charmap
# LC_ALL=en_US.utf8  locale -k LC_CTYPE | grep charmap

Ergo, applications that use the correct interface will be informed that 
the character encoding is UTF-8, irrespective of the format of the 
locale name.


Given the above, sticking to the lang_territory.UTF-8 format seems 
wise.
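
From a shell script, the nearest equivalent of calling nl_langinfo() is to ask locale(1) for the charmap instead of pattern-matching the locale name. A sketch contrasting the two approaches (both function names are made up for this example):

```shell
# Fragile: inspects the spelling of the locale name.
is_utf8_by_name() {
    case $1 in
        *.UTF-8) return 0 ;;
        *)       return 1 ;;
    esac
}

# More robust: asks the C library, via locale(1), which codeset the
# named locale actually uses. Warnings about unknown locales are hidden.
is_utf8_by_charmap() {
    [ "$(LC_ALL=$1 locale charmap 2>/dev/null)" = "UTF-8" ]
}

is_utf8_by_name en_US.UTF-8 && echo "name check accepts en_US.UTF-8"
is_utf8_by_name en_US.utf8  || echo "name check misses en_US.utf8"
```

Provided the locale is installed, is_utf8_by_charmap should give the same answer for either spelling, which is the behaviour the two locale -k commands above demonstrate.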




Having read the remainder of the Debian ML thread you referenced, I have a
headache. Debian did that to me when I used it for ~3 months in 2003.  :-)

Cheers,
Bruce





Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Mike Gilbert
On Mon, Aug 5, 2013 at 6:52 PM, Chris Stankevitz
chrisstankev...@gmail.com wrote:
 On Mon, Aug 5, 2013 at 11:53 AM, Mike Gilbert flop...@gentoo.org wrote:
 The handbook documents setting a system-wide default locale. You
 generally do this by setting the LANG variable in
 /etc/conf.d/02locale.

 http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?part=1&chap=8#doc_chap3_sect3

 Mike,

 Thank you for your help.  I attempted to follow these instructions and
 ran into three problems.  Can you please confirm the fixes I employed
 to deal with each of these issues:


I think the other responses in the thread have this covered, but I
will respond anyway.

 1. The handbook suggests I should modify the file /etc/env.d/02locale,
 but that file does not exist on my system.  RESOLUTION: create the
 file


Correct. This file can also be created by using eselect locale.

 2. The handbook suggests I should add this line to
 /etc/env.d/02locale: 'LANG=de_DE.UTF-8', but I do not speak the
 language DE.  RESOLUTION: type instead 'LANG=en_US.UTF-8' to match
 /etc/locale.gen


Right, the de_DE is just an example. You should select a
language/country that matches your lingual ability. :-)

 3. The handbook suggests that I should add this line to
 /etc/env.d/02locale: 'LC_COLLATE=C', but I do not know if they are
 again talking about the language DE.  RESOLUTION: I assumed
 LC_COLLATE=C refers to english and added the line without
 modification.


LC_COLLATE specifies how to sort text strings. Setting it to C
indicates that you want to sort strings based on the binary (ASCII)
value of their characters.

Leaving LC_COLLATE unset will cause strings to be sorted according to
the normal rules associated with your locale.

For example, given the following strings:

cat
Dog

With LC_COLLATE=C, they are sorted like this, since the binary value
of D (68) is less than the value of c (99).

Dog
cat

With LC_COLLATE=en_US.UTF-8, they are sorted like this, since c
comes before D in the alphabet.

cat
Dog
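
This is easy to try at a prompt. The sketch below only relies on the C locale, which is always present. The language-collation line is left as a comment because it needs en_US.UTF-8 to be installed.

```shell
# Numeric values of the two leading characters (in POSIX printf, a
# leading quote yields the code of the character that follows it).
printf '%d %d\n' "'D" "'c"     # prints: 68 99

# Byte-wise ordering: upper-case D sorts before lower-case c.
sort_c() { printf '%s\n' cat Dog | LC_ALL=C sort; }
sort_c
# prints:
# Dog
# cat

# For comparison, if that locale is installed:
#   printf '%s\n' cat Dog | LC_ALL=en_US.UTF-8 sort
```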



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Chris Stankevitz
On Tue, Aug 6, 2013 at 6:04 AM, Kerin Millar kerfra...@fastmail.co.uk wrote:
 Run eselect locale, first with the list parameter and then the set
 parameter as appropriate. It's easier.


Kerin, all,

Thanks for your help.  SVN (and I'm sure other apps) are happy now.

Chris



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Chris Stankevitz
On Tue, Aug 6, 2013 at 8:13 AM, Mike Gilbert flop...@gentoo.org wrote:
 Leaving LC_COLLATE unset will cause strings to be sorted according to
 the normal rules associated with your locale.

Mike (or anyone else),

For which applications does setting LC_COLLATE affect sorting:

a) Any C++ application that uses bool std::string::operator<(const std::string&)

b) Any C or C++ application that compares char values using the '<' operator

c) Any application that uses the system call CompareStrings(const
char*, const char*)

d) [your answer here]

I'm sure the answer is not a or b.  I'm sure it's not c either since I
just made it up.

Thank you,

Chris



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Stroller

On 6 August 2013, at 14:04, Kerin Millar wrote:
 ...
 If undefined, the value of LC_COLLATE is inherited from LANG. I'm not sure 
 that overriding it is particularly useful nowadays but it doesn't hurt.

It's been a couple of years since I looked into this, but I'm given to believe 
that LANG should set all LC_ variables correctly, and that overriding them is 
frowned upon.

I had to do this myself because, due to a bug, the en_GB time formatting failed 
to display am or pm. I believe this should be fixed now.

Stroller.
 


Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-06 Thread Mike Gilbert
On Tue, Aug 6, 2013 at 2:23 PM, Chris Stankevitz
chrisstankev...@gmail.com wrote:
 On Tue, Aug 6, 2013 at 8:13 AM, Mike Gilbert flop...@gentoo.org wrote:
 Leaving LC_COLLATE unset will cause strings to be sorted according to
 the normal rules associated with your locale.

 Mike (or anyone else),

 For which applications does setting LC_COLLATE affect sorting:

 a) Any C++ application that uses bool std::string::operator<(const 
 std::string&)

 b) Any C or C++ application that compares char values using the '<' operator

 c) Any application that uses the system call CompareStrings(const
 char*, const char*)

 d) [your answer here]

 I'm sure the answer is not a or b.  I'm sure it's not c either since I
 just made it up.


From locale(7):

   LC_COLLATE
          This is used to change the behavior of the functions strcoll(3)
          and strxfrm(3), which are used to compare strings in the local
          alphabet.  For example, the German sharp s is sorted as "ss".



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-05 Thread Mike Gilbert
On Mon, Aug 5, 2013 at 2:25 PM, Chris Stankevitz
chrisstankev...@gmail.com wrote:
 Hello,

 I am using svn to update a repository.  Somebody added files to the
 repository with weird characters in the filename.  SVN refuses to
 update the repository unless I first:

 export LC_CTYPE=en_US.UTF-8

 I don't know or really care what that mumbo jumbo means, but I would
 like an answer to this question:

 Is my gentoo system properly setup?  If not, what step did I miss that
 is causing svn to want me to export LC_CTYPE?

 I suspect either my gentoo system is messed up or svn is messed up.


Sparing you the details as requested: In general, you want to be using
a locale that ends with .UTF-8 to avoid encoding issues with
software like python and subversion.

The handbook documents setting a system-wide default locale. You
generally do this by setting the LANG variable in
/etc/conf.d/02locale.

http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?part=1&chap=8#doc_chap3_sect3



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-05 Thread Bruce Hill
On Mon, Aug 05, 2013 at 02:53:11PM -0400, Mike Gilbert wrote:
 On Mon, Aug 5, 2013 at 2:25 PM, Chris Stankevitz
 chrisstankev...@gmail.com wrote:
  Hello,
 
  I am using svn to update a repository.  Somebody added files to the
  repository with weird characters in the filename.  SVN refuses to
  update the repository unless I first:
 
  export LC_CTYPE=en_US.UTF-8
 
  I don't know or really care what that mumbo jumbo means, but I would
  like an answer to this question:
 
  Is my gentoo system properly setup?  If not, what step did I miss that
  is causing svn to want me to export LC_CTYPE?
 
  I suspect either my gentoo system is messed up or svn is messed up.
 
 
 Sparing you the details as requested: In general, you want to be using
 a locale that ends with .UTF-8 to avoid encoding issues with
 software like python and subversion.
 
 The handbook documents setting a system-wide default locale. You
 generally do this by setting the LANG variable in
 /etc/conf.d/02locale.
 
 http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?part=1&chap=8#doc_chap3_sect3
 
Without looking, shouldn't that be /etc/env.d/02locale ?



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-05 Thread Mike Gilbert
On Mon, Aug 5, 2013 at 2:57 PM, Bruce Hill
da...@happypenguincomputers.com wrote:
 On Mon, Aug 05, 2013 at 02:53:11PM -0400, Mike Gilbert wrote:
 On Mon, Aug 5, 2013 at 2:25 PM, Chris Stankevitz
 chrisstankev...@gmail.com wrote:
  Hello,
 
  I am using svn to update a repository.  Somebody added files to the
  repository with weird characters in the filename.  SVN refuses to
  update the repository unless I first:
 
  export LC_CTYPE=en_US.UTF-8
 
  I don't know or really care what that mumbo jumbo means, but I would
  like an answer to this question:
 
  Is my gentoo system properly setup?  If not, what step did I miss that
  is causing svn to want me to export LC_CTYPE?
 
  I suspect either my gentoo system is messed up or svn is messed up.
 

 Sparing you the details as requested: In general, you want to be using
 a locale that ends with .UTF-8 to avoid encoding issues with
 software like python and subversion.

 The handbook documents setting a system-wide default locale. You
 generally do this by setting the LANG variable in
 /etc/conf.d/02locale.

 http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?part=1&chap=8#doc_chap3_sect3

 Without looking, shouldn't that be /etc/env.d/02locale ?

Yes.

Or /etc/locale.conf if you're on systemd.



Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8

2013-08-05 Thread Chris Stankevitz
On Mon, Aug 5, 2013 at 11:53 AM, Mike Gilbert flop...@gentoo.org wrote:
 The handbook documents setting a system-wide default locale. You
 generally do this by setting the LANG variable in
 /etc/conf.d/02locale.

 http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?part=1&chap=8#doc_chap3_sect3

Mike,

Thank you for your help.  I attempted to follow these instructions and
ran into three problems.  Can you please confirm the fixes I employed
to deal with each of these issues:

1. The handbook suggests I should modify the file /etc/env.d/02locale,
but that file does not exist on my system.  RESOLUTION: create the
file

2. The handbook suggests I should add this line to
/etc/env.d/02locale: 'LANG=de_DE.UTF-8', but I do not speak the
language DE.  RESOLUTION: type instead 'LANG=en_US.UTF-8' to match
/etc/locale.gen

3. The handbook suggests that I should add this line to
/etc/env.d/02locale: 'LC_COLLATE=C', but I do not know if they are
again talking about the language DE.  RESOLUTION: I assumed
LC_COLLATE=C refers to english and added the line without
modification.
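
For reference, the file that results from the three resolutions above would look roughly like this. It is a sketch using only the values mentioned in this message; the locale named in LANG is assumed to be one that `locale -a` actually lists on the system.

```shell
# /etc/env.d/02locale
# Default locale for every category not set explicitly.
LANG="en_US.UTF-8"
# Sort by byte value rather than by the en_US rules.
LC_COLLATE="C"
```

On Gentoo, changes under /etc/env.d normally take effect after running env-update and starting a new login shell.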

Thank you again for your help,

Chris