"Jordan Brown (Sun)" wrote:
> Roland Mainz wrote:
> > ... if I interpret the situation correctly you're outputting a UTF-8
> > encoded string, right ? If that's "true" then there is a _serious_
> > problem since such a value would be an invalid character sequence for
> > non-UTF-8 multibyte encodings. You may get away in some shells like
> > ksh93, but only by "accident" because one of the implementation details
> > of ksh93 is that it treats all things as plain strings unless it needs
> > to do special handling like quotes, IFS etc. In that case the shell
> > script will break because you hit invalid characters... which is AFAIK
> > bad... ;-(
> 
> Note that the design of UTF-8 is such that "plain ASCII" values 00-7F
> always represent the plain ASCII characters.  Non-ASCII characters,
> including all of the bytes of multibyte sequences, are always in the
> range 80-FF.
> 
> That largely protects UTF-8 strings from misinterpretation by
> applications that only understand ASCII.

Yes, but _if_ the application is multibyte-aware then it will
malfunction, since things like the internal |mbtowc()| translation will
fail at the first point where an invalid byte sequence is hit (e.g. the
strings or input streams will be cut off at that point). In some
encodings the shell may recover, but in others any further processing
may be impossible.
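
To illustrate the failure mode, here is a minimal sketch (not taken
from ksh93 or any other shell; the helper name |valid_prefix_len()| is
made up for this example). It walks a byte buffer with |mbtowc()| in
whatever LC_CTYPE locale is currently active and reports how many
leading bytes decode cleanly before the first invalid or incomplete
sequence:

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper, for illustration only: return the number of
 * leading bytes of buf[0..len) that mbtowc() accepts in the current
 * LC_CTYPE locale. A result smaller than len means the data contains
 * a byte sequence that is invalid (or truncated) in this locale.
 */
size_t valid_prefix_len(const char *buf, size_t len)
{
    size_t off = 0;

    /* Reset mbtowc()'s internal shift state before starting. */
    (void)mbtowc(NULL, NULL, 0);

    while (off < len) {
        wchar_t wc;
        int n = mbtowc(&wc, buf + off, len - off);

        if (n < 0)
            break;      /* invalid or incomplete sequence: stop here */

        /* mbtowc() returns 0 for a NUL byte; step over it manually. */
        off += (n == 0) ? 1 : (size_t)n;
    }
    return off;
}
```

A caller would first select the locale via |setlocale(LC_ALL, "")| and
then compare the result against the buffer length; the offset returned
is roughly where a multibyte-aware consumer would stop or start
misbehaving. Pure ASCII input is accepted completely in the "C" locale.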

The only workaround would be to say that shell scripts which want to
process the output of /usr/bin/svcprop must be run in the "C" locale
(e.g. with LC_ALL=C), but that defeats the purpose of supporting
non-ASCII characters in SMF in the first place...

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)
