On Thu, Mar 01, 2007 at 09:41:44AM +0100, Marcel Ruff wrote:
> 
> >>Are you thinking of Java's _modified_ version of UTF-8
> >>(http://en.wikipedia.org/wiki/UTF-8#Java)?
> >>    
> >
> >Uhg, disgusting...
> >  
> Yes - this is an open & serious issue for my approach!
> 
> Has anybody some practical advice on this?

Just treat the byte sequence C0 80 the way the spec requires: as an
invalid sequence. Neither it (because it's an overlong encoding, which
is illegal in UTF-8) nor a real NUL (because it's illegal in text)
should appear in your data. If your problem is more specific and
there's a real reason you need to handle such data differently, please
describe what you're doing so we can offer better advice.
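
As a rough illustration of that rule, here is a minimal sketch in
Python. The function name is_clean_text is made up for this example,
and it assumes you have the whole input as a byte string. It relies on
Python's built-in strict UTF-8 decoder, which rejects overlong forms
such as C0 80, and then separately rejects a real NUL byte:

```python
def is_clean_text(data: bytes) -> bool:
    """Return True if data is strict UTF-8 and contains no NUL byte.

    Hypothetical helper for illustration only.
    """
    try:
        # The strict decoder raises on any ill-formed sequence,
        # including the overlong two-byte form C0 80.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # A real NUL (U+0000, byte 00) is valid UTF-8 but not valid in text.
    return b"\x00" not in data
```

If you really do have to accept Java's modified UTF-8, that would need
a separate, explicit conversion step before a check like this one.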

Rich

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
