On Sun, May 22, 2005 at 06:09:49AM -0700, Simon Michael wrote:
> I'm not sure where this problem lies, or if it's a bug. I'll give 
> specifics in case someone can help:
> 
> I am recording a patch in an emacs shell buffer. I copy the author's 
> name (João with an a tilde) from thunderbird and paste into emacs. It 
> looks correct in both places; both are unicode-aware and using utf-8 I 
> believe.
> 
> Then,
> 
> - darcs changes in the emacs shell buffer shows Jo\343o
> 
> - if I copy/paste that to thunderbird, I see a tilde again
> 
> - darcs changes in gnome terminal shows a box character
> 
> - darcs.cgi shows Jo\e3o (it used to die; thanks, whoever fixed it)
> 
> Any ideas ?

The ã gets recorded in Latin-1 encoding (hex code e3).
The UTF-8 encoding would be the two octets ã (hex codes
c3 a3).

So your emacs and thunderbird seem to use Latin-1, but your
gnome terminal seems to use UTF-8 and therefore displays the
"illegal" UTF-8 octet sequence e3 as a box character.
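
To make the byte-level difference concrete, here is a small Python sketch
(not anything darcs itself runs; the variable names are only illustrative).
It encodes the character both ways and shows what a UTF-8 decoder does
with the lone Latin-1 octet:

```python
# Illustrative sketch only: compare the Latin-1 and UTF-8 encodings of "ã".
char = "\u00e3"  # LATIN SMALL LETTER A WITH TILDE

latin1_bytes = char.encode("latin-1")  # one octet
utf8_bytes = char.encode("utf-8")      # two octets

print(latin1_bytes.hex())  # e3
print(utf8_bytes.hex())    # c3a3

# The two UTF-8 octets, misread as Latin-1, come out as two characters.
misread = utf8_bytes.decode("latin-1")

# The name as it was recorded: "Jo", the Latin-1 octet e3, then "o".
recorded = b"Jo\xe3o"

# A strict UTF-8 decoder rejects it, because e3 announces a multi-byte
# sequence but is followed by a plain ASCII "o".
try:
    recorded.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# A lenient decoder substitutes U+FFFD, which many terminals draw as a box.
replaced = recorded.decode("utf-8", errors="replace")
```

The substitution in the last step is most likely what gnome terminal is
doing when it shows the box.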

Darcs does not automatically display non-ASCII chars, but
escapes them (like \e3 in the darcs.cgi case).  You have
to set some DARCS_??? environment variables (depending on
version of darcs) to enable non-ASCII output, and they seem to
be set correctly in your emacs shell but not in the environment
of darcs.cgi.
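
Note that the two escaped forms in the report, \343 and \e3, name the
same octet: 343 in octal is e3 in hex. The Python sketch below shows
that kind of escaping. The helper escape_non_ascii is hypothetical and
written only for illustration; it is not how darcs implements it.

```python
def escape_non_ascii(data: bytes, base: str = "hex") -> str:
    """Hypothetical helper (not darcs code): show octets >= 0x80 as escapes."""
    out = []
    for octet in data:
        if octet < 0x80:
            out.append(chr(octet))          # plain ASCII passes through
        elif base == "hex":
            out.append("\\%02x" % octet)    # e.g. \e3
        else:
            out.append("\\%03o" % octet)    # e.g. \343
    return "".join(out)


recorded = b"Jo\xe3o"  # the name with a Latin-1 encoded a-tilde
print(escape_non_ascii(recorded, "hex"))    # Jo\e3o
print(escape_non_ascii(recorded, "octal"))  # Jo\343o
```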

I think copy/paste in X11 can sometimes play tricks with the
encoding too, but the output of darcs.cgi clearly shows that
the ã was recorded in Latin-1.


-- 
Tommy Pettersson <[EMAIL PROTECTED]>

_______________________________________________
darcs-users mailing list
[email protected]
http://www.abridgegame.org/mailman/listinfo/darcs-users
