The best advice you can get is to steer clear of wide characters. You should never need to use any wide character functions. Keep the data in your program internally represented as UTF-8. The standard byte-oriented functions (strlen, strcpy, strstr, printf, etc.) work fine with UTF-8, as long as you remember that they count bytes rather than characters.
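To illustrate the point about byte-oriented functions, here is a minimal sketch. The helper names utf8_codepoints and utf8_find are made up for this example, and it assumes the input is valid, NUL-terminated UTF-8. It shows that strstr can be used unchanged for searching, and that the only extra work is counting characters when you need a character count instead of the byte count that strlen gives you.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the code points in a valid UTF-8 string.
   Every byte that is NOT a continuation byte (bit pattern 10xxxxxx)
   starts a new code point, so we simply count those bytes. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical helper: byte offset of needle within haystack, or -1 if
   it does not occur. Plain strstr is enough here: in valid UTF-8 a lead
   byte never looks like a continuation byte, so a match of a complete
   needle cannot begin in the middle of a character. */
long utf8_find(const char *haystack, const char *needle)
{
    const char *p;

    assert(haystack != NULL && needle != NULL);
    p = strstr(haystack, needle);
    return p != NULL ? (long)(p - haystack) : -1;
}
```

For example, with the Persian string "سلام دنیا" stored as UTF-8, strlen reports 17 (bytes), utf8_codepoints reports 9 (characters), and utf8_find for "دنیا" returns byte offset 9. Note that these are code points, not user-perceived characters; combining marks would still be counted separately.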
XML uses UTF-8 by default as well, so little if any conversion between encodings should be needed. You may have to convert your input from a legacy encoding to UTF-8, or you could just convert it externally using something such as this:

    cat inputfile | iconv -t utf-8 | myprogram

Being "Unicode aware" is trivial in this fashion.

2007/4/16, Ali Majdzadeh <[EMAIL PROTECTED]>:
Hello All

Sorry, if my questions are elementary. As I know, the size of wchar_t data type (glibc), is compiler and platform dependent. What is the best practice of writing portable Unicode-aware C programs? Is it a good practice to use Unicode literals directly in a C program? I have experienced some problems with glibc's wide character string functions, I want to know is there any standard way of programming or standard template to write a Unicode-aware C program?

By the way, my native language is Persian. I am working on a C program which reads a Persian text file, parses it and generates an XML document. For this, there exist lots of issues that need the use of library functions (eg. wcscpy(), wcsstr(), wcscmp(), fgetws(), wfprintf(), ...), and, as I mentioned earlier, I have experienced some odd problems using them. (eg. wcsstr() never succeeds in matching two wchar_t * Persian strings.)

Best Regards
Ali
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
