I'm playing around with GNU Emacs 21.2.1 as it comes with Red Hat 8
(emacs-21.2-18.rpm).
Emacs automatically recognizes UTF-8 files and it is trivial to switch
to a suitable *-iso10646-1 font to get it displayed properly. Very nice
so far.
However, I have big problems with entering non-ASCII characters.
Whenever I press keys to which I have assigned non-ASCII keysyms with
xmodmap, such as
adiaeresis, leftsinglequotemark, leftdoublequotemark, ...
Emacs inserts into the buffer a sequence of two or three characters
that look like the UTF-8 byte sequence for the entered character, but
interpreted as ISO 8859-1 bytes. So pressing ä causes Ã¤ to be inserted
into the buffer. Also, cut&paste into an xterm in UTF-8 mode is broken.
When I select the ä in the buffer and paste it into a UTF-8 xterm, I get
it again expanded to Ã¤.
Environment variables: LANG=en_GB.UTF-8
Leim is not installed -> no input method used.
So it seems that Xlib is correctly converting my keysyms to UTF-8
sequences (as the locale says it should), but Emacs then
misinterprets these as ISO 8859-1 bytes.
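The suspected effect is easy to reproduce outside Emacs. The Python
sketch below only illustrates the byte-level mix-up, assuming the
keyboard input really arrives as UTF-8 and is then decoded as
ISO 8859-1. The function names misread_as_latin1 and undo_misread are
made up for this example; none of this is Emacs or Xlib code.

```python
# Illustration only: hypothetical helpers, not part of Emacs or Xlib.

def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode the bytes as ISO 8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def undo_misread(garbled: str) -> str:
    """Reverse the mix-up: back to the raw bytes, then decode as UTF-8."""
    return garbled.encode("iso-8859-1").decode("utf-8")


if __name__ == "__main__":
    # adiaeresis, leftsinglequotemark, leftdoublequotemark
    for ch in ("\u00e4", "\u2018", "\u201c"):
        garbled = misread_as_latin1(ch)
        # One typed character turns into two or three buffer characters.
        print(f"U+{ord(ch):04X} -> {len(garbled)} characters: {garbled!r}")
        assert undo_misread(garbled) == ch
```

adiaeresis (U+00E4) is two bytes in UTF-8 and the two quotation marks
are three bytes each, which would match the "two or three characters"
I see in the buffer.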
When I press Ctrl-h C Return, I get the following display:
------------------------------------------------------------------------------
Coding system for saving this buffer:
Not set locally, use the default.
Default coding system (for new files):
u -- mule-utf-8 (alias: utf-8)
Coding system for keyboard input:
u -- utf-8 (alias of mule-utf-8)
Coding system for terminal output:
u -- utf-8 (alias of mule-utf-8)
Defaults for subprocess I/O:
decoding: u -- mule-utf-8 (alias: utf-8)
encoding: u -- mule-utf-8 (alias: utf-8)
Priority order for recognizing coding systems when reading files:
1. mule-utf-8 (alias: utf-8)
2. iso-latin-1 (alias: iso-8859-1 latin-1)
3. iso-2022-jp (alias: junet)
4. iso-2022-7bit
5. iso-2022-7bit-lock (alias: iso-2022-int-1)
6. iso-2022-8bit-ss2
7. emacs-mule
8. raw-text
9. japanese-shift-jis (alias: shift_jis sjis)
10. chinese-big5 (alias: big5 cn-big5)
11. no-conversion (alias: binary)
Other coding systems cannot be distinguished automatically
from these, and therefore cannot be recognized automatically
with the present coding system priorities.
The followings are decoded correctly but recognized as iso-2022-7bit-lock:
iso-2022-7bit-ss2 iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext
iso-2022-jp-2 iso-2022-kr
Particular coding systems specified for certain file names:
OPERATION TARGET PATTERN CODING SYSTEM(s)
--------- -------------- ----------------
File I/O "\\.po[tx]?\\'\\|\\.po\\."
po-find-file-coding-system
"\\.elc\\'" (emacs-mule . emacs-mule)
"\\(\\`\\|/\\)loaddefs.el\\'"
(raw-text . raw-text-unix)
"\\.tar\\'" (no-conversion . no-conversion)
"" (undecided)
Process I/O nothing specified
Network I/O nothing specified
------------------------------------------------------------------------------
When I start Emacs in the C locale and press adiaeresis, the ä's
appear separated by a space in the minibuffer line, as if I had pressed
Alt-digit.
Any ideas on what is wrong and how to fix this?
Was Emacs 21.2 ever tested under a UTF-8 locale?
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/