> My initial plan for finding out about the current locale is that the program will, at start up, look at the LC_CTYPE environment variable. If that variable is defined and contains the substring "UTF-8" or regex-able
After calling setlocale(LC_ALL, ""), you should use nl_langinfo(CODESET) wherever it is available. **Only** if it is not available should you fall back to examining the LC_ALL, LC_CTYPE, and LANG environment variables, in that order.
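
Something along these lines is what I have in mind. Take it as a rough sketch, not tested on every platform: the function names (is_utf8_name, locale_from_env, env_locale_is_utf8, locale_is_utf8) are made up for illustration, and I'm assuming that <langinfo.h> exists on Unix-like systems and that the CODESET macro is defined whenever nl_langinfo(CODESET) is usable. A configure-time check would be more reliable than the preprocessor test used here.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: <langinfo.h> is present on Unix-like systems. */
#if defined(__unix__) || defined(__APPLE__)
#include <langinfo.h>
#endif

/* Hypothetical helper: ASCII case-insensitive string equality. */
int ascii_caseeq(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Codeset names vary in spelling ("UTF-8", "utf8", ...); accept both. */
int is_utf8_name(const char *name)
{
    return name != NULL &&
           (ascii_caseeq(name, "UTF-8") || ascii_caseeq(name, "UTF8"));
}

/* Fallback: the first of LC_ALL, LC_CTYPE, LANG that is set and non-empty. */
const char *locale_from_env(void)
{
    static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    size_t i;

    for (i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        const char *v = getenv(vars[i]);
        if (v != NULL && *v != '\0')
            return v;
    }
    return NULL;
}

/* Check the codeset part of a name such as "ko_KR.UTF-8@modifier". */
int env_locale_is_utf8(const char *loc)
{
    char buf[64];
    const char *codeset;
    size_t n;

    if (loc == NULL || (codeset = strchr(loc, '.')) == NULL)
        return 0;
    codeset++;                      /* skip the '.' */
    n = strcspn(codeset, "@");      /* stop before any @modifier */
    if (n >= sizeof buf)
        return 0;
    memcpy(buf, codeset, n);
    buf[n] = '\0';
    return is_utf8_name(buf);
}

/* Prefer nl_langinfo(CODESET); use the environment only without it. */
int locale_is_utf8(void)
{
#ifdef CODESET
    setlocale(LC_ALL, "");
    return is_utf8_name(nl_langinfo(CODESET));
#else
    return env_locale_is_utf8(locale_from_env());
#endif
}
```

Call locale_is_utf8() once at startup and remember the result. Note that the fallback only looks at the part after the dot, so a bare "ko_KR" with no codeset suffix is treated as non-UTF-8.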
> xterm didn't handle Indic or RTL scripts)).
It doesn't yet (but it does support Thai and Hangul conjoining jamo, along with up to two diacritical marks for the Latin, Cyrillic, and Greek alphabets). You may want to try mlterm for Indic scripts. See also Markus Kuhn's FAQ on Linux and Unicode (Google will get you right there).
Jungshik Shin

