Roland Mainz wrote:
> ... if I interpret the situation correctly you're outputting a UTF-8
> encoded string, right ? If that's "true" then there is a _serious_
> problem since such a value would be an invalid character sequence for
> non-UTF-8 multibyte encodings. You may get away with it in some shells
> like ksh93, but only by "accident" because one of the implementation
> details of ksh93 is that it treats all things as plain strings unless it
> needs to do special handling like quotes, IFS etc. In that case the shell
> script will break because you hit invalid characters... which is AFAIK
> bad... ;-(

Note that the design of UTF-8 is such that the byte values 00-7F (hex)
always represent the plain ASCII characters.  Every byte used to encode
a non-ASCII character, including the lead byte and all continuation
bytes of a multibyte sequence, is always in the range 80-FF.

That largely protects UTF-8 strings from misinterpretation by
applications that only understand ASCII.
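
As a rough illustration of that property (not anything taken from ksh93
or from the script under discussion), here is a small Python sketch.  The
helper names and the sample string are made up for the example.  It
checks that every byte of each non-ASCII character is in 80-FF, and then
splits the raw bytes on an ASCII delimiter without knowing anything
about UTF-8:

```python
def non_ascii_bytes_are_high(text):
    """Return True if every byte encoding a non-ASCII character is >= 0x80."""
    for ch in text:
        if ord(ch) > 0x7F:
            # Lead byte and continuation bytes of the UTF-8 sequence.
            if any(b < 0x80 for b in ch.encode("utf-8")):
                return False
    return True


def split_on_ascii_byte(data, delimiter=b":"):
    """Split raw bytes on an ASCII delimiter, with no UTF-8 awareness."""
    return data.split(delimiter)


# Hypothetical sample: three fields separated by an ASCII colon.
sample = "naïve:日本語:plain"
raw = sample.encode("utf-8")

# The byte-level split cannot cut through a multibyte sequence, because
# the colon byte (0x3A) never occurs inside one.
fields = [part.decode("utf-8") for part in split_on_ascii_byte(raw)]
```

Each field decodes cleanly again because the split only ever lands on a
real ASCII colon.  This only covers the ASCII-only case described above;
it says nothing about how a decoder for some other multibyte encoding
would treat the same bytes.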

