Hi, I have a question, and I was a bit sad that I couldn't readily find the answer to it...
Let's say I want to create a C API (a C library) with functions that take strings as arguments. What am I supposed to use if I want these strings to be in any language? Obviously the answer is "Unicode", but that doesn't really answer the question... How is Unicode used in C?

As far as I can see, there are two major approaches to this problem.

One approach, used in the Win32 C APIs on MS-Windows, and also in Java and other languages, is to use "wide characters": characters of 16 or 32 bits each, with a string being an array of such characters.

The second approach, introduced by Plan 9, is to use UTF-8.

I personally prefer the UTF-8 approach, because it fits naturally with C's "char *" type and with Linux's system calls (which take char *, not any sort of wide characters), but I'm not at all sure that this is what users actually want. If it isn't, I wonder why.

Some background on this question: people have been complaining for years that Hspell, and in particular the libhspell functions, use ISO-8859-8 instead of "Unicode". But if one wants to add Unicode to libhspell, what should it be? UTF-8? Wide chars (UTF-16 or UTF-32)?

Thanks,
Nadav.

-- 
Nadav Har'El                        | Monday, Mar 12 2012,
n...@math.technion.ac.il            |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |We could wipe out world hunger if we knew
http://nadav.harel.org.il           |how to make AOL's Free CD's edible!

_______________________________________________
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
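
To make the UTF-8 option from the question above more concrete, here is a minimal sketch of what a UTF-8 front end to an ISO-8859-8 library might look like. The function name utf8_to_iso8859_8 and its signature are hypothetical (they are not part of libhspell), and the sketch assumes that only ASCII and the Hebrew letters U+05D0..U+05EA need to be supported; anything else is rejected rather than converted.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, NOT part of libhspell: convert a NUL-terminated
 * UTF-8 string to ISO-8859-8. Only ASCII and the Hebrew letters
 * U+05D0..U+05EA (which ISO-8859-8 places at 0xE0..0xFA) are handled.
 * Returns 0 on success, -1 on unsupported or malformed input, or if
 * the output buffer is too small (including the terminating NUL).
 */
int utf8_to_iso8859_8(const char *in, char *out, size_t outsize)
{
    const unsigned char *p;
    size_t n = 0;

    assert(in != NULL && out != NULL);
    p = (const unsigned char *)in;

    while (*p != '\0') {
        unsigned char c;

        if (*p < 0x80) {
            /* ASCII is identical in UTF-8 and ISO-8859-8. */
            c = *p++;
        } else if ((*p & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
            /* Two-byte UTF-8 sequence: 110xxxxx 10xxxxxx. */
            unsigned int cp = ((unsigned int)(p[0] & 0x1F) << 6)
                            | (unsigned int)(p[1] & 0x3F);
            if (cp < 0x05D0 || cp > 0x05EA)
                return -1;          /* not a Hebrew letter */
            c = (unsigned char)(0xE0 + (cp - 0x05D0));
            p += 2;
        } else {
            return -1;              /* longer sequence or malformed byte */
        }

        /* Keep one byte free for the terminating NUL. */
        if (n + 1 >= outsize)
            return -1;
        out[n++] = (char)c;
    }

    if (n >= outsize)
        return -1;                  /* no room even for the NUL */
    out[n] = '\0';
    return 0;
}
```

With something like this, a caller keeps passing ordinary "char *" strings, and the existing ISO-8859-8 code path can stay unchanged behind the conversion. A wide-character variant would instead need a parallel set of entry points taking wchar_t * (or a fixed-width 16- or 32-bit type), since the size of wchar_t differs between platforms.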