On 12/11/2013 11:42 AM, Ken Moffat wrote:
> On Wed, Dec 11, 2013 at 12:46:09AM -0500, Alan Feuerbacher wrote:
>>
>> LC_ALL=en_US locale charmap -> ISO-8859-1
>> LC_ALL=en_US.iso88591 locale charmap -> ISO-8859-1
>> LC_ALL=en_US.utf8 locale charmap -> UTF-8
>>
>> So far as I understand, US English installations work with either of the
>> above charmap settings.
>>
>> Can someone explain the difference?
>
> So long as you use _only_ ASCII characters or the few symbols and
> accented letters offered in it, ISO-8859-1 works fine. Once people
> start using UTF-8 (like in my .sig), things break down.
>
> If you look at iso-8859-1 on wikipedia it will show you the limited
> range of glyphs / codepoints it supports. What that page *doesn't*
> mention is the encoding. For that, look at the UTF-8 page if you
> are interested in the messy details. The point is that ANY latin-1
> (ISO-8859-1) character with a value greater than 0x7F is represented
> by a single byte.
>
> However, when I send you the same character in UTF-8 it will occupy
> more than one byte. For example, the copyright sign is 0x00A9 - in
> UTF-8 that becomes 0xC2 0xA9 [ © ] if I've read the UTF-8 wiki page
> correctly.
>
>> And what I should set in the Samba
>> smb.conf file for "unix charset"?
>>
> If you have ISO-8859-1 data in the files offered by Samba, then I
> guess you need to use 8859-1. Otherwise, use UTF-8. Windows has
> supported UTF-8 for a long time.
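
The byte-level difference described in the quoted text can be checked with
a short Python sketch. The helper name encoded_bytes is made up for this
illustration; only the standard str.encode / bytes.decode calls are used.

```python
def encoded_bytes(text, encoding):
    """Return the bytes of `text` in `encoding` as space-separated hex."""
    return " ".join(f"0x{b:02X}" for b in text.encode(encoding))


copyright_sign = "\u00a9"  # U+00A9 COPYRIGHT SIGN

# One byte in ISO-8859-1, two bytes in UTF-8.
print(encoded_bytes(copyright_sign, "iso-8859-1"))  # 0xA9
print(encoded_bytes(copyright_sign, "utf-8"))       # 0xC2 0xA9

# Plain ASCII is byte-for-byte identical in both encodings.
print(encoded_bytes("A", "iso-8859-1"), encoded_bytes("A", "utf-8"))  # 0x41 0x41

# Reading UTF-8 bytes as if they were ISO-8859-1 is where things break down:
print(copyright_sign.encode("utf-8").decode("iso-8859-1"))  # Â©
```

This is why ASCII-only text looks the same under either charmap, while
anything above 0x7F depends on both sides agreeing on the encoding.
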
How can I tell if I have "ISO-8859-1 data in the files offered by Samba"?
As I understand it, Samba is a general file server, so in general it
should handle all manner of files; hence I should use UTF-8, no?

Alan
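
One rough way to approach the question is to scan the shared tree for names
that are not valid UTF-8. The sketch below assumes, as far as I understand
the smb.conf documentation, that "unix charset" concerns how Samba
interprets names on the Unix side rather than file contents. The function
names and the share path in the comment are hypothetical. Note that it can
only show that a name is *not* UTF-8; it cannot prove that such a name is
ISO-8859-1, since any byte sequence can be read as ISO-8859-1.

```python
import os


def classify_name(raw: bytes) -> str:
    """Classify a raw file name as 'ascii', 'utf-8' or 'not-utf-8'."""
    if all(b < 0x80 for b in raw):
        return "ascii"  # identical in ISO-8859-1 and UTF-8
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not-utf-8"  # possibly a legacy charset such as ISO-8859-1


def scan_tree(root: bytes):
    """Yield (path, classification) for every non-ASCII name below root."""
    # Passing bytes to os.walk makes it return raw, undecoded names.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            kind = classify_name(name)
            if kind != "ascii":
                yield os.path.join(dirpath, name), kind


# Example use (the path is a placeholder, not taken from this thread):
#   for path, kind in scan_tree(b"/srv/samba/share"):
#       print(kind, path)
```

If the scan reports nothing, or only "utf-8" names, UTF-8 looks like the
safe choice; "not-utf-8" names would need a closer look before deciding.
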
