Martijn Dekker dixit:

>So it looks like multibyte character support is not activated when
>commands are executed with -c.

Right. We just had this discussion a few weeks ago. Not a bug.

For the purpose of POSIX, mksh operates in the "C" locale, and anything else is implementation-defined behaviour. mksh does not track the LANG and LC_* variables; the one exception is at startup, when an interactive shell may look at them, depending on the compilation settings.

Read “man mksh”, section CAVEATS. Right at the end, there is:

     For the purpose of POSIX, mksh supports only the "C" locale. For
     users of UTF-8 locales, the following sh code makes the shell
     match the locale:

           case ${KSH_VERSION:-} in
           *MIRBSD KSH*|*LEGACY KSH*)
                   case ${LC_ALL:-${LC_CTYPE:-${LANG:-}}} in
                   *[Uu][Tt][Ff]8*|*[Uu][Tt][Ff]-8*) set -U ;;
                   *) set +U ;;
                   esac ;;
           esac

Short form, if you know you’re running mksh already:

     set -U; [[ ${LC_ALL:-${LC_CTYPE:-${LANG:-}}} = *[Uu][Tt][Ff]?(-)8* ]] || set +U

The basic idea behind this is: on most OSes, scripts do not explicitly export LC_ALL=C at startup, yet assume it from historical tradition. Enabling UTF-8 mode for scripts (and -c is a scriptlet) would break too much.

bye,
//mirabilos
-- 
“It is inappropriate to require that a time represented as seconds since the Epoch precisely represent the number of seconds between the referenced time and the Epoch.”
	-- IEEE Std 1003.1b-1993 (POSIX) Section B.2.2.2