On Jun 3 12:02, Christopher Faylor wrote:
> On Wed, Jun 03, 2009 at 04:27:55PM +0200, Corinna Vinschen wrote:
> >On Jun 3 09:18, Edward Lam wrote:
> >> Corinna Vinschen wrote:
> >>> The question is, what do you expect? [...]
> >> [...]
> >> Wikipedia has several suggestions on how to handle invalid UTF-8 byte
> >> sequences (http://en.wikipedia.org/wiki/UTF-8). Personally, I favor the
> >> rule that uses the replacement character.
> >
> >Chris implemented using the invalid code point solution. The discussion
> >in http://www.mail-archive.com/linux-u...@nl.linux.org/msg00080.html
> >supports this solution. What's missing so far is the way back, from
> >an invalid single second half of a surrogate pair in the 0xDCxx range
> >back to the correct byte value. I'm just looking into that.
>
> The way back was not, AFAIK, needed for Cygwin programs. I don't think
> there is a valid way back for Windows programs.

The way back is not needed for the argv handling in Cygwin, but it
becomes necessary if you convert to UTF-16 in other circumstances.
It's not much of a problem, since the way back is a no-brainer, in
contrast to the conversion to UTF-16.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
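
For illustration only, the "way back" discussed above could look roughly
like the following Python sketch. It assumes that an undecodable byte 0xNN
was stored as the lone low surrogate 0xDCNN; the thread only says the value
lies in the 0xDCxx range, so the exact mapping is an assumption, and the
names escape_byte and units_to_bytes are hypothetical, not taken from the
Cygwin sources.

```python
def escape_byte(b):
    """Forward direction: represent an undecodable byte as a lone low surrogate."""
    if not 0x80 <= b <= 0xFF:
        raise ValueError("only bytes 0x80..0xFF can be invalid in UTF-8")
    # Assumption: the original byte is kept in the low 8 bits of 0xDCxx.
    return 0xDC00 | b


def units_to_bytes(units):
    """Way back: convert UTF-16 code units to bytes, restoring escaped bytes."""
    out = bytearray()
    i = 0
    while i < len(units):
        u = units[i]
        nxt = units[i + 1] if i + 1 < len(units) else None
        if 0xD800 <= u <= 0xDBFF and nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
            # Valid surrogate pair: combine into one code point, emit as UTF-8.
            cp = 0x10000 + ((u - 0xD800) << 10) + (nxt - 0xDC00)
            out += chr(cp).encode("utf-8")
            i += 2
        elif 0xDC80 <= u <= 0xDCFF:
            # Lone low surrogate in the escape range: emit the original byte.
            out.append(u & 0xFF)
            i += 1
        else:
            # Ordinary BMP character. Any other lone surrogate raises
            # UnicodeEncodeError here; this sketch does not handle that case.
            out += chr(u).encode("utf-8")
            i += 1
    return bytes(out)
```

With these definitions, units_to_bytes([0x61, escape_byte(0xFF), 0x62])
gives b"a\xffb", while a proper surrogate pair is still encoded as a normal
four-byte UTF-8 sequence. The pair check comes first so that a low
surrogate which legitimately follows a high surrogate is never mistaken for
an escaped byte.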