Re: texteditors that can process and save in different encodings

Doug Ewell Wed, 17 Oct 2012 22:14:55 -0700

Philippe Verdy wrote:

But the most basic converters between encodings (not syntax
transformers such as converting characters into escape sequences for
specific computer languages) should be integrated (this includes
standard UTF's, notably UTF-8 and probably UTF-16,


So far so good.

ASCII,


A strict subset of UTF-8, so no need to support this separately.
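
As a quick illustration of the subset claim, here is a minimal Python sketch. The names `same_decoding`, `same_encoding`, and `is_pure_ascii` are made up for this example; only the standard codecs are used.

```python
# All 128 ASCII code points, as raw bytes 0x00..0x7F.
ascii_bytes = bytes(range(0x80))

# Decoding the same bytes as ASCII or as UTF-8 gives the same text.
as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")
same_decoding = as_ascii == as_utf8

# Encoding that text as UTF-8 gives back the original bytes unchanged.
same_encoding = as_ascii.encode("utf-8") == ascii_bytes


def is_pure_ascii(data: bytes) -> bool:
    """Hypothetical helper: True if every byte is in the 7-bit range."""
    return all(b < 0x80 for b in data)
```

A file for which `is_pure_ascii` returns True can be read or written through the UTF-8 path without any separate ASCII converter.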

and most probably ISO-8859 1,

People outside of the Americas and Western Europe might disagree with this "obvious" default SBCS choice.

and its Windows 1252 extension which replaces the deprecated C1
controls from ISO 8859, as agreed now in HTML5 and most common
practices ;

C1 controls are deprecated from HTML5, and probably from other versions of HTML, and from XML. Even in 2012, other types of text files are rumored to exist. Until C1 controls are formally deprecated from ISO 6429 and/or ECMA 48, it is incorrect to declare them "deprecated" in general.
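
To make the difference concrete, the sketch below compares how bytes 0x80 through 0x9F decode under ISO 8859-1 and Windows-1252. It relies on Python's built-in `latin-1` and `cp1252` codecs; the function name `c1_range_differences` is invented for this example, and the handling of unassigned bytes reflects Python's codec, not necessarily what a browser or other converter does.

```python
def c1_range_differences():
    """Map each byte 0x80..0x9F to its (ISO 8859-1, Windows-1252) decoding."""
    diffs = {}
    for b in range(0x80, 0xA0):
        raw = bytes([b])
        # ISO 8859-1 maps these bytes straight to the C1 controls U+0080..U+009F.
        latin1 = raw.decode("latin-1")
        try:
            # Windows-1252 assigns printable characters to most of them.
            cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            # A few bytes (0x81, for one) are unassigned in Python's cp1252 codec.
            cp1252 = None
        diffs[b] = (latin1, cp1252)
    return diffs


diffs = c1_range_differences()
```

For example, `diffs[0x80]` pairs the control character U+0080 with the euro sign U+20AC, so a file that really does contain C1 controls is silently changed if it is read as Windows-1252.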

this should also include the integrated support for local encodings
that are already natively integrated in the OS for its legacy 8-bit
encoding, which should be supported by using local OS API's,

Step by step, this started with "the most basic converters" and has evolved into something much more extensive. The .NET framework supports dozens of non-Unicode encodings. Once you go down this path, users will reasonably expect your app to provide all kinds of character processing, like CRLF conversion and \Uxxxx conversion and trailing-space stripping and tab/space conversion and maybe normalization. This is the situation we are in today.
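
For a sense of how quickly these extras pile up, here is a rough Python sketch that bundles several of them into one pass. The function `tidy` and its `tab_width` parameter are hypothetical, the choice of NFC is an assumption, and \Uxxxx conversion is left out.

```python
import unicodedata


def tidy(text: str, tab_width: int = 4) -> str:
    """Hypothetical clean-up pass combining several of the operations above."""
    # CRLF conversion: treat CRLF and lone CR as LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Trailing-space stripping, then tab-to-space conversion, line by line.
    lines = [
        line.rstrip(" \t").expandtabs(tab_width)
        for line in text.split("\n")
    ]
    # Normalization: NFC is assumed here; an editor might offer other forms.
    return unicodedata.normalize("NFC", "\n".join(lines))
```

Each step is only a line or two, but every one of them is another option the user will expect to be able to configure.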


--
Doug Ewell | Thornton, Colorado, USA

http://www.ewellic.org | @DougEwell
