"Jordan Brown (Sun)" wrote:
> Roland Mainz wrote:
> > ... if I interpret the situation correctly you're outputting a UTF-8
> > encoded string, right ? If that's "true" then there is a _serious_
> > problem since such a value would be an invalid character sequence for
> > non-UTF-8 multibyte encodings. You may get away with it in some shells
> > like ksh93, but only by "accident" because one of the implementation
> > details of ksh93 is that it treats all things as plain strings unless
> > it needs to do special handling like quotes, IFS etc. In that case the
> > shell script will break because you hit invalid characters... which is
> > AFAIK bad... ;-(
>
> Note that the design of UTF-8 is such that "plain ASCII" values 00-7F
> always represent the plain ASCII characters. Non-ASCII characters,
> including all of the bytes of multibyte sequences, are always in the
> range 80-FF.
>
> That largely protects UTF-8 strings from misinterpretation by
> applications that only understand ASCII.

Yes, but _if_ the application is multibyte-aware then it will
malfunction, since things like the internal |mbtowc()| translation will
fail at the first point where an invalid byte sequence is hit (e.g. the
strings or input streams will be cut off at that point; in some
encodings the shell may recover, but in others any further processing
may be impossible).

The only workaround would be to say that shell scripts which want to
process the output of /usr/bin/svcprop must be run in the "C" locale
(e.g. LC_ALL=C), but that somehow defeats the concept of supporting
non-ASCII characters in the whole SMF stuff...

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)
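
P.S. A rough sketch of the failure mode described above, for illustration only. It uses Python's strict codecs as a stand-in for |mbtowc()| in a multibyte locale, so it is not what the shell actually does internally. The sample value, the helper name decode_or_error and the choice of EUC-JP as the non-UTF-8 multibyte encoding are all assumptions made up for this example.

```python
# Sketch only: Python's strict codecs stand in for the C library's mbtowc().
# The sample value, the helper name and the choice of EUC-JP are assumptions
# made for illustration; none of them come from svcprop or SMF.

def decode_or_error(raw: bytes, encoding: str) -> str:
    """Decode raw strictly; return the text, or a note on where decoding failed."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"invalid byte sequence at offset {exc.start}"

value = "price=5\u20ac"        # hypothetical property value with a euro sign
raw = value.encode("utf-8")    # the bytes a UTF-8 producer would emit

# As noted in the quoted text, every byte of the non-ASCII character
# lies in the range 80-FF, so an ASCII-only consumer is not confused.
non_ascii = [b for b in raw if b >= 0x80]
print([hex(b) for b in non_ascii])

# A UTF-8 aware consumer gets the original string back ...
print(decode_or_error(raw, "utf-8"))

# ... but a consumer that expects EUC-JP hits an invalid sequence.
print(decode_or_error(raw, "euc_jp"))
```

Note that not every UTF-8 sequence is rejected this way: some byte pairs happen to form a valid character in the other encoding and are then silently misread instead of reported, which is arguably worse.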