Felix Winkelmann scripsit:

> Well, actually we might as well support several: ASCII/Latin-1, UTF-8
> and UCS-2/UCS-4. Without UTF-8 it would just be a variable
> element-size option. But I agree that this doesn't make maintenance
> any easier... Let's think some more about this. We don't have to
> decide right now.

UCS-2 is obsolete; it would need to be UTF-16 (i.e. with support for surrogates).

In any case, Alex's point about the FFI is strong. Even on Windows, UTF-8 is coming to be the dominant way to talk to C programs, and it's part of the spirit of Chicken (IIUC) that talking to C is clean and easy. On Posix systems, UTF-8 is massively dominant.

Similarly, on the Web, UTF-8 encodes a huge majority of all Web pages. As of early 2012, UTF-8 (including pure ASCII) was at 80% (see <http://googleblog.blogspot.com/2012/02/unicode-over-60-percent-of-web.html>), and <http://w3techs.com/technologies/overview/character_encoding/all> shows it still rising. These figures aren't comparable, because Google is using its whole index and the *effective* encoding, whereas W3Techs is using only a large subset (10 million sites, usually only one page per site) and the declared encoding (HTTP header, HTML meta, etc.). Still, both reports are loud and clear that UTF-8 is winning. Not having to transcode web pages most of the time is a win too.

-- 
John Cowan  http://www.ccil.org/~cowan  [email protected]
Why are well-meaning Westerners so concerned that the opening of a Colonel
Sanders in Beijing means the end of Chinese culture? [...] We have had
Chinese restaurants in America for over a century, and it hasn't made us
Chinese. On the contrary, we obliged the Chinese to invent chop suey.
        --Marshall Sahlins
