I'm playing around with GNU Emacs 21.2.1 as it comes with Red Hat 8
(emacs-21.2-18.rpm).

Emacs automatically recognizes UTF-8 files and it is trivial to switch
to a suitable *-iso10646-1 font to get it displayed properly. Very nice
so far.

However, I have big problems with entering non-ASCII characters.
Whenever I press keys to which I have used xmodmap to assign non-ASCII
keysyms such as

  adiaeresis, leftsinglequotemark, leftdoublequotemark, ...

then Emacs inserts into the buffer a sequence of two or three characters
that look like the UTF-8 byte sequences for the entered characters, but
interpreted as ISO 8859-1 bytes. So pressing ä causes Ã¤ to be inserted
into the buffer. Also, cut&paste into an xterm in UTF-8 mode is broken.
When I select the ä in the buffer and paste it into a UTF-8 xterm, I get
it again expanded to Ã¤.
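
To make the symptom concrete, here is a minimal sketch (in Python, purely
for illustration; it does not touch Emacs or Xlib) that encodes each
character as UTF-8 and then wrongly decodes the bytes as ISO 8859-1. The
function name and the keysym-to-character table are my own choices for the
example, not anything Emacs uses.

```python
# Illustration only: reproduce the "UTF-8 read as ISO 8859-1" symptom.
# The helper name and the sample table are assumptions made for this sketch.

def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as ISO 8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")

# Characters behind the X keysyms mentioned above.
samples = {
    "adiaeresis": "\u00e4",           # a with diaeresis
    "leftsinglequotemark": "\u2018",  # left single quotation mark
    "leftdoublequotemark": "\u201c",  # left double quotation mark
}

for keysym, char in samples.items():
    wrong = misread_as_latin1(char)
    n_bytes = len(char.encode("utf-8"))
    # Each UTF-8 byte turns into exactly one ISO 8859-1 character.
    print(f"{keysym}: {n_bytes} UTF-8 bytes -> {len(wrong)} characters {wrong!r}")
```

The two-byte case corresponds to ä and the three-byte cases to the
quotation marks, which matches the "two or three characters" I see.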

Environment variables: LANG=en_GB.UTF-8
Leim is not installed -> no input method used.

So it seems like Xlib is correctly converting my keysyms to UTF-8
sequences (as the locale says it should do), but then Emacs
misinterprets these as ISO 8859-1 bytes.
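
One way to check this guess on the characters that end up in the buffer is
to reverse the suspected misreading: re-encode them as ISO 8859-1 and
decode the result as UTF-8. The sketch below (Python, illustration only;
the helper name is hypothetical) recovers the intended character if, and
only if, the text really is UTF-8 read as ISO 8859-1.

```python
# Illustration only: undo a suspected "UTF-8 read as ISO 8859-1" misreading.
# The helper name is an assumption made for this sketch.

def undo_latin1_misread(garbled: str) -> str:
    """Re-encode as ISO 8859-1 and decode as UTF-8.

    Raises UnicodeEncodeError or UnicodeDecodeError if the input cannot be
    the result of such a misreading.
    """
    return garbled.encode("iso-8859-1").decode("utf-8")

# The two characters that appear after pressing the adiaeresis key.
inserted = "\u00c3\u00a4"
print(repr(undo_latin1_misread(inserted)))
```

If the round trip succeeds for every garbled sequence, that supports the
idea that the bytes arriving from Xlib are valid UTF-8 and only the
decoding step is wrong.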

When I press Ctrl-h C Return, I get the following display:

------------------------------------------------------------------------------
Coding system for saving this buffer:
  Not set locally, use the default.
Default coding system (for new files):
  u -- mule-utf-8 (alias: utf-8)
Coding system for keyboard input:
  u -- utf-8 (alias of mule-utf-8)
Coding system for terminal output:
  u -- utf-8 (alias of mule-utf-8)
Defaults for subprocess I/O:
  decoding: u -- mule-utf-8 (alias: utf-8)
  encoding: u -- mule-utf-8 (alias: utf-8)

Priority order for recognizing coding systems when reading files:
  1. mule-utf-8 (alias: utf-8)
  2. iso-latin-1 (alias: iso-8859-1 latin-1)
  3. iso-2022-jp (alias: junet)
  4. iso-2022-7bit 
  5. iso-2022-7bit-lock (alias: iso-2022-int-1)
  6. iso-2022-8bit-ss2 
  7. emacs-mule 
  8. raw-text 
  9. japanese-shift-jis (alias: shift_jis sjis)
  10. chinese-big5 (alias: big5 cn-big5)
  11. no-conversion (alias: binary)

  Other coding systems cannot be distinguished automatically
  from these, and therefore cannot be recognized automatically
  with the present coding system priorities.

  The followings are decoded correctly but recognized as iso-2022-7bit-lock:
    iso-2022-7bit-ss2 iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext
    iso-2022-jp-2 iso-2022-kr

Particular coding systems specified for certain file names:

  OPERATION     TARGET PATTERN          CODING SYSTEM(s)
  ---------     --------------          ----------------
  File I/O      "\\.po[tx]?\\'\\|\\.po\\."
                                        po-find-file-coding-system
                "\\.elc\\'"             (emacs-mule . emacs-mule)
                "\\(\\`\\|/\\)loaddefs.el\\'"
                                        (raw-text . raw-text-unix)
                "\\.tar\\'"             (no-conversion . no-conversion)
                ""                      (undecided)
  Process I/O   nothing specified
  Network I/O   nothing specified
------------------------------------------------------------------------------

When I start emacs in the C locale, and press adiaeresis, then the ä's
appear separated by a space in the minibuffer line, as if I had pressed
Alt-digit.

Any ideas on what is wrong and how to fix this?
Was Emacs 21.2 ever tested under a UTF-8 locale?

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/