Re: Default character encoding for each operating system?

2016-09-15 Thread Philippe Verdy
Not all internals. Many kernel drivers (notably bus drivers) still use an
8-bit OEM encoding in their debugging logs (most often based on a US English
locale, even if the installed version is localized to another language; but
I've seen CP850 still used, and you can see some samples in the Event
Viewer). Those messages are in fact not localized at all and are intended only
for debugging or analysis by developers, or for display on a Windows console.

Many console tools on Windows still use the default 8-bit OEM charset and
won't display any Unicode output, even when the console is set to use a
Unicode codepage (I can still see some mojibake, even on Windows 10). When
those output messages are read by other UI tools, they won't be
interpreted in the OEM codepage they were written in but in the default
"ANSI" codepage (such as Windows-1252).
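
To make the OEM/"ANSI" mismatch concrete, here is a minimal Python sketch.
It assumes CP850 as the OEM codepage and Windows-1252 as the "ANSI" one; the
sample word is only an illustration, and real tools may use other pairs.

```python
# Text as a console tool might emit it under an OEM codepage
# (CP850 is an assumption made for this illustration).
original = "café"
oem_bytes = original.encode("cp850")

# A UI tool that assumes the "ANSI" codepage (Windows-1252) misreads
# the very same bytes, which is where the mojibake comes from.
misread = oem_bytes.decode("cp1252")

# Decoding with the codepage that was actually used recovers the text.
recovered = oem_bytes.decode("cp850")

print(repr(original), oem_bytes, repr(misread), repr(recovered))
```

The bytes themselves carry no label saying which codepage produced them, so
the reader has to know it by some other means.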

Filesystems still use legacy charsets in their basic directory structure
(e.g. when inserting a FAT or FAT32 volume formatted without the LFN
extensions for Windows, which also store filenames in UTF-16, such as an SD
card formatted on a digital camera). As the directories and filenames created
on those devices only use ASCII and uninformative names such as
IMG1.JPG, this generally does not cause a problem, but no Unicode name
is stored. I have, however, seen some digital cameras storing filenames in
a legacy Chinese or Japanese charset, which are incorrectly rendered when
viewing their content on a non-Japanese/Chinese system.
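
The camera case can be sketched the same way in Python. The filename below
and the helper `decode_candidates` are hypothetical, and Shift_JIS, CP437 and
CP850 are only example codepages; the point is that one stored byte sequence
yields a different name under each codepage the reading system might assume.

```python
def decode_candidates(raw: bytes, codepages: list[str]) -> dict[str, str]:
    """Decode the same raw name bytes under each candidate codepage."""
    return {cp: raw.decode(cp, errors="replace") for cp in codepages}


# Hypothetical name a camera might store in a legacy Japanese charset.
name = "写真"
raw = name.encode("shift_jis")

# One reader assumes Shift_JIS; the others assume Western OEM codepages.
views = decode_candidates(raw, ["shift_jis", "cp437", "cp850"])

for codepage, text in views.items():
    print(f"{codepage:10} {text!r}")
```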

2016-09-15 16:36 GMT+02:00 John W Kennedy :

> macOS, and its offspring, iOS, watchOS, and tvOS, use UTF-16LE for all
> internals, but readily import and export all versions of Unicode and a good
> many historic 8-bit and mixed-length codings.
>
> In the new Swift programming language, which is white-hot in the Apple
> community, Apple is moving toward a model of a transparent, generic Unicode
> that can be “viewed” as UTF-8, UTF-16, or UTF-32 if necessary, but in which
> a “character” contains however many code points it needs (“e” with a
> stacked macron, acute accent, and dieresis is algorithmically one
> “character” in Swift). Moreover, e-with-an-acute-accent and e followed by a
> combining acute accent, for example, compare as equal. At present, the
> underlying code is still UTF-16LE.
>
> --
> SKen Software, LLC
> Coming soon to an iPhone near you
>
> On Sep 15, 2016, at 9:19 AM, Philippe Verdy  wrote:
>
> A better question is what is the default character encoding for the
> **installed** operating system.
>
> Unfortunately it has no single answer, because there are several default
> encodings for several parts of the OS. An OS has lots of components, many
> of which are not transparent to the encoding they use.
>
> All three OSes you cite support several default character encodings, and
> in addition they support them in several encoding forms. All three support
> Unicode internally, but not in all software components, which will run with
> one or the other.
>
> And defaults will change according to your distribution or OS
> configuration options, and to your own current user settings.
>
> 2016-09-15 13:14 GMT+02:00 Costello, Roger L. :
>
>> Hi Folks,
>>
>> In a book that I am reading [1] the author mentions “the default
>> character encoding for the operating system.” What is the default character
>> encoding of:
>>
>> -  Windows 10
>>
>> -  Mac OS
>>
>> -  Linux
>>
>>
>> /Roger
>>
>> [1] *Practical Common Lisp* by Peter Seibel, p. 165 (footnote 2).
>>
>
>


Re: Default character encoding for each operating system?

2016-09-15 Thread John W Kennedy
macOS, and its offspring, iOS, watchOS, and tvOS, use UTF-16LE for all 
internals, but readily import and export all versions of Unicode and a good 
many historic 8-bit and mixed-length codings. 

In the new Swift programming language, which is white-hot in the Apple 
community, Apple is moving toward a model of a transparent, generic Unicode 
that can be “viewed” as UTF-8, UTF-16, or UTF-32 if necessary, but in which a 
“character” contains however many code points it needs (“e” with a stacked 
macron, acute accent, and dieresis is algorithmically one “character” in 
Swift). Moreover, e-with-an-acute-accent and e followed by a combining acute 
accent, for example, compare as equal. At present, the underlying code is still 
UTF-16LE.
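
The equivalence described above can be illustrated outside Swift. The
following is a minimal Python sketch, not Swift's implementation: Python
strings compare code point by code point, so the comparison has to go through
an explicit normalization step (NFC is used here). The helper names
`canonically_equal` and `encoded_sizes` are made up for this example.

```python
import unicodedata

precomposed = "\u00e9"   # e-with-an-acute-accent as a single code point
decomposed = "e\u0301"   # e followed by a combining acute accent


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def encoded_sizes(text: str) -> dict[str, int]:
    """Byte length of the same text 'viewed' in three encoding forms."""
    return {enc: len(text.encode(enc))
            for enc in ("utf-8", "utf-16-le", "utf-32-le")}


print(precomposed == decomposed)                   # raw code point comparison
print(canonically_equal(precomposed, decomposed))  # comparison after NFC
print(encoded_sizes(precomposed), encoded_sizes(decomposed))
```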

-- 
SKen Software, LLC
Coming soon to an iPhone near you

> On Sep 15, 2016, at 9:19 AM, Philippe Verdy  wrote:
> 
> A better question is what is the default character encoding for the 
> **installed** operating system.
> 
> Unfortunately it has no single answer, because there are several default 
> encodings for several parts of the OS. An OS has lots of components, many of 
> which are not transparent to the encoding they use.
> 
> All three OSes you cite support several default character encodings, and in 
> addition they support them in several encoding forms. All three support 
> Unicode internally, but not in all software components, which will run with 
> one or the other.
> 
> And defaults will change according to your distribution or OS configuration 
> options, and to your own current user settings.
> 
> 2016-09-15 13:14 GMT+02:00 Costello, Roger L. :
>> Hi Folks,
>> 
>> In a book that I am reading [1] the author mentions “the default character 
>> encoding for the operating system.” What is the default character encoding 
>> of:
>> 
>> -  Windows 10
>> 
>> -  Mac OS
>> 
>> -  Linux
>> 
>> 
>> /Roger
>> 
>> [1] Practical Common Lisp by Peter Seibel, p. 165 (footnote 2).
>> 
> 


Re: Default character encoding for each operating system?

2016-09-15 Thread Philippe Verdy
A better question is what is the default character encoding for the
**installed** operating system.

Unfortunately it has no single answer, because there are several default
encodings for several parts of the OS. An OS has lots of components, many
of which are not transparent to the encoding they use.

All three OSes you cite support several default character encodings, and in
addition they support them in several encoding forms. All three support
Unicode internally, but not in all software components, which will run with
one or the other.

And defaults will change according to your distribution or OS configuration
options, and to your own current user settings.
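
One way to see that there are several "defaults" at once is to ask a single
process what each of its parts would use. This is a small Python sketch using
standard-library calls only; `encoding_report` is a name invented for this
example, and the values it returns depend on the OS, its configuration and
the current user settings.

```python
import locale
import sys


def encoding_report() -> dict:
    """Collect the encodings different parts of this process default to."""
    return {
        # Encoding the locale settings suggest for text data.
        "locale_preferred": locale.getpreferredencoding(False),
        # Encoding used to convert filenames to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding of the output stream (may be None if it was replaced).
        "stdout": getattr(sys.stdout, "encoding", None),
        # Default encoding of Python's own str/bytes conversions.
        "str_default": sys.getdefaultencoding(),
    }


if __name__ == "__main__":
    for component, encoding in encoding_report().items():
        print(f"{component:17} {encoding}")
```

On many systems these entries do not all agree, and redirecting output or
changing the locale can change some of them without touching the others.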

2016-09-15 13:14 GMT+02:00 Costello, Roger L. :

> Hi Folks,
>
> In a book that I am reading [1] the author mentions “the default character
> encoding for the operating system.” What is the default character encoding
> of:
>
> -  Windows 10
>
> -  Mac OS
>
> -  Linux
>
>
> /Roger
>
> [1] *Practical Common Lisp* by Peter Seibel, p. 165 (footnote 2).
>


Re: Default character encoding for each operating system?

2016-09-15 Thread David Starner
Linux is far less specific than Windows 10. In all recent versions of
Debian GNU/Linux, UTF-8 is the most common character encoding, but
ISO-8859-x and, I believe, even something like EUC-JP are still supported.
Other distributions may enforce UTF-8 or, in rare cases, ISO 8859-1 or even
something else.
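
On Linux the encoding in effect usually follows from the locale environment
variables. The sketch below, in Python, assumes the POSIX locale name form
`language[_territory][.codeset][@modifier]` and the POSIX precedence
LC_ALL, then LC_CTYPE, then LANG. The function names are hypothetical, and
when a locale name has no codeset part the real charset depends on the
system's locale definitions, so the sketch returns None in that case.

```python
from typing import Mapping, Optional


def codeset_from_locale(name: str) -> Optional[str]:
    """Return the codeset part of language[_territory][.codeset][@modifier]."""
    base = name.split("@", 1)[0]      # drop an optional @modifier
    if "." not in base:
        return None                   # e.g. "C" or "en_US": no codeset named
    return base.split(".", 1)[1]


def effective_codeset(env: Mapping[str, str]) -> Optional[str]:
    """Pick the codeset from the first non-empty locale variable."""
    for variable in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(variable)
        if value:                     # unset or empty variables are skipped
            return codeset_from_locale(value)
    return None


print(effective_codeset({"LANG": "en_US.UTF-8"}))
print(effective_codeset({"LANG": "en_US.UTF-8", "LC_ALL": "ja_JP.eucJP"}))
```

Passing `os.environ` as the mapping applies the same lookup to the current
session.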

On Thu, Sep 15, 2016, 4:18 AM Costello, Roger L.  wrote:

> Hi Folks,
>
> In a book that I am reading [1] the author mentions “the default character
> encoding for the operating system.” What is the default character encoding
> of:
>
> -  Windows 10
>
> -  Mac OS
>
> -  Linux
>
>
> /Roger
>
> [1] *Practical Common Lisp* by Peter Seibel, p. 165 (footnote 2).
>