Re: Mutt showing ? in place of space
Hello Derek,

On 2024-03-29 16:47:22, Derek Martin wrote:
> On Sat, Mar 23, 2024 at 07:41:45PM +0800, Sadeep Madurange wrote:
> > Initially, LANG was unset and LC_CTYPE="C". The character encoding
> > was US-ASCII. I changed these variables (i.e., LANG, LC_CTYPE and
> > locale settings) to en_US.UTF-8. Then the ? changed to ?. So, looks
> > like you are on to something. I will check this with OpenBSD
> > community as well.
> >
> > In Xdefaults, I have set XTerm*utf-8 setting to true as well.
>
> Your problem is that these settings are not consistent (and you still
> have this problem, because the "solution" proposed by Sirius is
> incorrect--even if it appears to have solved your issue). By having
> LANG unset, you've told your shell (and therefore everything started
> by it) to use ASCII, but you've explicitly told xterm to use Unicode.
> That's wrong.
>
> The TL;DR of this is:
>
> 1. You should NEVER need to set Mutt's charset explicitly. [*]
> 2. Your shell, Mutt, and X should all inherit what they need from your
>    LANG environment variable, assuming it is set properly for your
>    system and environment (it definitely isn't in your case).
> 3. Setting Mutt's charset may appear to "work" but it's not the
>    correct solution, because your shell and terminal settings are
>    still inconsistent. You'll have trouble with other things later if
>    you don't fix this.
>
> [*] Except in extremely rare and completely esoteric cases that apply
> only to experts... and by now should really apply to no one.

Thank you for patiently explaining. That was very educational. This is
what I used to do on Linux in the past, though, without knowing why.

Unfortunately, this doesn't seem to work on OpenBSD. So, perhaps this
qualifies as one of the esoteric cases. OpenBSD doesn't seem to pay
much attention to the LANG variable.
The following is an excerpt from the locale man page:

  "Programs in the OpenBSD base system ignore the locale except for the
  character encoding, and it is not recommended to use any of these
  variables except that the following non-default setting is supported
  as an option:

      export LC_CTYPE=en_US.UTF-8" [1]

[1] https://man.openbsd.org/locale.1

--
Sadeep Madurange
PGP: 103BF9E3E750BF7E
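For the OpenBSD case quoted from the man page, a minimal sketch of the matching ~/.profile fragment might look like the following. It assumes a POSIX shell such as ksh. The helper name codeset_status is hypothetical and not part of any tool; it only labels a codeset name, such as the one printed by `locale charmap`.

```shell
# ~/.profile fragment (sketch): set only the character encoding, the one
# non-default locale setting the quoted man page excerpt describes.
export LC_CTYPE=en_US.UTF-8

# codeset_status (hypothetical helper): report whether a codeset name,
# e.g. the output of `locale charmap`, is UTF-8.
codeset_status() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) echo "ok: $1" ;;
        *)                     echo "not UTF-8: ${1:-unknown}" ;;
    esac
}

# Usage: codeset_status "$(locale charmap)"
```

Running the usage line in a new login shell is one way to confirm that the terminal and the programs started from it agree on the encoding.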
Re: Mutt showing ? in place of space
On Fri, Mar 29, 2024 at 03:03:54PM -0700, Bob Crochelt wrote:
> Thank you Derek!
> Bob Crochelt

No problem, glad I could help. I had to figure this stuff all out the
hard way 20 years ago, when I moved to Korea and suddenly needed to be
able to type in both English and Korean... I've been answering it on
this list ever since... seriously, see this post from 2006:

  https://marc.info/?l=mutt-users&m=114434601817142&w=2

Things were A LOT worse then--Unicode hadn't really been adopted very
much in 2004. Some of the finer details of how (or rather where) to
set those settings keep changing (in part because young developers
can't leave well enough alone when a solution has been working for
literally decades), and most people don't need to futz with fonts
anymore since Unicode is the default everywhere now, and the modern
font rendering libraries handle all this automatically--but the basics
of the problem and its solution are still the same.

In another such post I commented that this topic comes up often enough
that it really ought to be a FAQ. Some day I'll take the time to
collect the details in a bit better format and provide one...

--
Derek D. Martin    http://www.pizzashack.org/
GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address. Replying to it will
result in undeliverable mail due to spam prevention. Sorry for the
inconvenience.

> On Fri, Mar 29, 2024, at 13:47, Derek Martin wrote:
> > On Sat, Mar 23, 2024 at 07:41:45PM +0800, Sadeep Madurange wrote:
> >> Initially, LANG was unset and LC_CTYPE="C". The character encoding was
> >> US-ASCII. I changed these variables (i.e., LANG, LC_CTYPE and locale
> >> settings) to en_US.UTF-8. Then the ? changed to ?. So, looks like you
> >> are on to something. I will check this with OpenBSD community as well.
> >>
> >> In Xdefaults, I have set XTerm*utf-8 setting to true as well.
Re: Mutt showing ? in place of space
Thank you Derek!
Bob Crochelt

On Fri, Mar 29, 2024, at 13:47, Derek Martin wrote:
> On Sat, Mar 23, 2024 at 07:41:45PM +0800, Sadeep Madurange wrote:
>> Initially, LANG was unset and LC_CTYPE="C". The character encoding was
>> US-ASCII. I changed these variables (i.e., LANG, LC_CTYPE and locale
>> settings) to en_US.UTF-8. Then the ? changed to ?. So, looks like you
>> are on to something. I will check this with OpenBSD community as well.
>>
>> In Xdefaults, I have set XTerm*utf-8 setting to true as well.
>
> Your problem is that these settings are not consistent (and you still
> have this problem, because the "solution" proposed by Sirius is
> incorrect--even if it appears to have solved your issue). By having
> LANG unset, you've told your shell (and therefore everything started
> by it) to use ASCII, but you've explicitly told xterm to use Unicode.
> That's wrong.
>
> The TL;DR of this is:
>
> 1. You should NEVER need to set Mutt's charset explicitly. [*]
> 2. Your shell, Mutt, and X should all inherit what they need from your
>    LANG environment variable, assuming it is set properly for your
>    system and environment (it definitely isn't in your case).
> 3. Setting Mutt's charset may appear to "work" but it's not the
>    correct solution, because your shell and terminal settings are
>    still inconsistent. You'll have trouble with other things later if
>    you don't fix this.
>
> [*] Except in extremely rare and completely esoteric cases that apply
> only to experts... and by now should really apply to no one.
>
>
> The unfortunately lengthy details:
> --
>
> Displaying characters properly is actually tricky business on modern
> computers, because of the legacy methods by which we tried to
> accommodate different languages, and the (relatively) recent advent of
> Unicode to unify that mess.
Re: Mutt showing ? in place of space
On Sat, Mar 23, 2024 at 07:41:45PM +0800, Sadeep Madurange wrote:
> Initially, LANG was unset and LC_CTYPE="C". The character encoding was
> US-ASCII. I changed these variables (i.e., LANG, LC_CTYPE and locale
> settings) to en_US.UTF-8. Then the ? changed to ?. So, looks like you
> are on to something. I will check this with OpenBSD community as well.
>
> In Xdefaults, I have set XTerm*utf-8 setting to true as well.

Your problem is that these settings are not consistent (and you still
have this problem, because the "solution" proposed by Sirius is
incorrect--even if it appears to have solved your issue). By having
LANG unset, you've told your shell (and therefore everything started
by it) to use ASCII, but you've explicitly told xterm to use Unicode.
That's wrong.

The TL;DR of this is:

 1. You should NEVER need to set Mutt's charset explicitly. [*]
 2. Your shell, Mutt, and X should all inherit what they need from your
    LANG environment variable, assuming it is set properly for your
    system and environment (it definitely isn't in your case).
 3. Setting Mutt's charset may appear to "work" but it's not the
    correct solution, because your shell and terminal settings are
    still inconsistent. You'll have trouble with other things later if
    you don't fix this.

[*] Except in extremely rare and completely esoteric cases that apply
    only to experts... and by now should really apply to no one.


The unfortunately lengthy details:
--

Displaying characters properly is actually tricky business on modern
computers, because of the legacy methods by which we tried to
accommodate different languages, and the (relatively) recent advent of
Unicode to unify that mess. All of the following must be set
consistently: Your shell, your terminal program (or your operating
system's console), your font, all of your application programs, and
when appropriate, the X window system. If any of these are not
consistently set, you can, and eventually WILL, have trouble.
Most modern systems have the concept of a default locale, which is
typically set for you at install time, and which every process you
start inherits, unless you configure your user environment
differently.

Fortunately, there is a very simple mechanism by which this happens,
which is the LANG environment variable. There are additional
ancillary environment variables which start with "LC_*" but you
usually should not have to set any of these, because they inherit
their value from LANG if they are not explicitly set. When you run
the locale command, values enclosed in quotes are inherited from LANG,
and values NOT enclosed in quotes have been set explicitly:

  $ locale
  LANG=en_US.UTF-8
  LANGUAGE=
  LC_CTYPE="en_US.UTF-8"
  LC_NUMERIC="en_US.UTF-8"
  LC_TIME="en_US.UTF-8"
  LC_COLLATE=C
  LC_MONETARY="en_US.UTF-8"
  LC_MESSAGES="en_US.UTF-8"
  LC_PAPER="en_US.UTF-8"
  LC_NAME="en_US.UTF-8"
  LC_ADDRESS="en_US.UTF-8"
  LC_TELEPHONE="en_US.UTF-8"
  LC_MEASUREMENT="en_US.UTF-8"
  LC_IDENTIFICATION="en_US.UTF-8"
  LC_ALL=

Here you can see that I explicitly set LC_COLLATE=C. The rest are
inherited from LANG. Typically most users will want to leave all of
the LC_* variables unset, and inherit from LANG.

I haven't tried a *BSD in a really long while, but if it doesn't ask
you for your default locale during install, or if you made a mistake
setting it up, then you should add the settings manually to your login
shell environment. If you're using UTF-8 (which you should be--by now
every modern OS uses it by default), the value of LANG should reflect
that. Pretty much no one should be using ASCII anymore (i.e. LANG
should NEVER be unset). The most portable way to do that would be to
include the following in BOTH .profile and .kshrc (or whatever file
you've set ENV to):

  LANG=en_US.UTF-8
  export LANG

[See the Invocation section of the ksh man page for exact details of
which files you should put this in, but in general it's the ones I
said.]
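The quoting rule for locale output described above (quoted values are inherited, unquoted ones were set explicitly) can be checked mechanically. The following is only a minimal sketch, assuming locale prints NAME=value lines as in the sample output. The function name classify_locale is hypothetical.

```shell
# classify_locale (hypothetical helper): read `locale`-style output on
# stdin and label each variable by how its value is displayed.
classify_locale() {
    # Split each line at the first '='; any remainder goes into $value.
    while IFS='=' read -r name value; do
        case "$value" in
            '')    echo "$name unset" ;;      # nothing after the '='
            \"*\") echo "$name inherited" ;;  # quoted: implied by LANG
            *)     echo "$name explicit" ;;   # unquoted: set directly
        esac
    done
}

# Usage: locale | classify_locale
```

Any LC_* line reported as explicit is a candidate for removal from the shell startup files if the goal is to inherit everything from LANG.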

Of course if you are not using English, change en_US to whatever your
default language is, but you'll want to retain the ".UTF-8" portion.

That *should* be sufficient to handle everything... however, there may
be additional places you'll need to add it for your X applications,
depending on exactly what OpenBSD does to initialize users' X
sessions. In general, the X startup stuff is supposed to make sure
that it sources the user's environment so that you don't need to
figure out which of the 17 different files you actually need to put
this stuff in... but over the years most vendors have bastardized how
X sessions start up, so you may have to look up or trace out how your
system does it to make sure everything works correctly. So you might
have to also add it to .xinitrc or .xsession or similar. Or, if it
does not already, you could simply have your X init thingy source your
.profile