On Sun, May 22, 2005 at 06:09:49AM -0700, Simon Michael wrote:
> I'm not sure where this problem lies, or if it's a bug. I'll give
> specifics in case someone can help:
>
> I am recording a patch in an emacs shell buffer. I copy the author's
> name (João with an a tilde) from thunderbird and paste into emacs. It
> looks correct in both places; both are unicode-aware and using utf-8 I
> believe.
>
> Then,
>
> - darcs changes in the emacs shell buffer shows Jo\343o
>
> - if I copy/paste that to thunderbird, I see a tilde again
>
> - darcs changes in gnome terminal shows a box character
>
> - darcs.cgi shows Jo\e3o (it used to die; thanks, whoever fixed it)
>
> Any ideas ?

The ã gets recorded in Latin-1 encoding (hex code e3; the \343 you see
in emacs is the same octet written in octal). The UTF-8 encoding would
be the two octets c3 a3, which look like "Ã£" when viewed as Latin-1.
So your emacs and thunderbird seem to use Latin-1, but your gnome
terminal seems to use UTF-8 and therefore displays the lone octet e3,
which is not a valid UTF-8 sequence, as a box character.

Darcs does not automatically display non-ASCII chars, but escapes them
(like \e3 in the darcs.cgi case). You have to set some DARCS_???
environment variables (depending on version of darcs) to enable
non-ASCII output. They seem to be set correctly in your emacs shell
but not in the environment of darcs.cgi.

I think copy/paste in X11 can sometimes play tricks with the encoding
too, but the output of darcs.cgi clearly shows that the ã was recorded
in Latin-1.

-- 
Tommy Pettersson <[EMAIL PROTECTED]>
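
To make the octets concrete, here is a small Python sketch. It is not darcs code: the helper names are made up for illustration, and the escape format only imitates the \e3 style seen in the darcs.cgi output.

```python
# Illustration only: the two encodings of "João" discussed above.
name = "Jo\u00e3o"  # U+00E3 is LATIN SMALL LETTER A WITH TILDE

latin1_bytes = name.encode("latin-1")  # one octet for the a-tilde: e3
utf8_bytes = name.encode("utf-8")      # two octets for the a-tilde: c3 a3


def escape_non_ascii(data: bytes) -> str:
    """Hypothetical helper: show ASCII as-is, other octets as \\xx (hex)."""
    return "".join(chr(b) if b < 0x80 else "\\%02x" % b for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the octets decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(escape_non_ascii(latin1_bytes))  # Latin-1 recording, escaped
print(escape_non_ascii(utf8_bytes))    # what a UTF-8 recording would look like
print(is_valid_utf8(latin1_bytes))     # a lone e3 is not valid UTF-8
print(utf8_bytes.decode("latin-1"))    # UTF-8 octets misread as Latin-1
```

A lone e3 fails UTF-8 validation because it announces a three-octet sequence that never follows, which is consistent with a UTF-8 terminal drawing a box for it.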
