On Wed, May 21, 2008 at 12:47:57AM +0200, Agustin Martin wrote:
> On Tue, May 20, 2008 at 06:26:34PM +0200, Lionel Elie Mamane wrote:
>> What I *think* happens is that as
>> /var/lib/dictionaries-common/aspell/aspell-he is in UTF-8 and (data
>> out of it) comes at the head of
>> /v/cache/d-c/emacsen-ispell-dicts.el, it triggers emacs to treat
>> the whole /v/c/d-c/emacsen-ispell-dicts.el file as UTF-8; when it
>> encounters ISO-8859 data it thus keeps it as a binary stream of
>> octets rather than converting it to its internal representation
>> (mule?). But without aspell-he installed, it treats the file as
>> iso8859-1!
> I tend to think that emacs finds two non-compatible encodings and because of
> that reads the file as a data stream.
Ah, yes. Indeed:
(assoc "hebrew" debian-aspell-only-dictionary-alist)
("hebrew"
"[\327\220\327\221\327\222\327\223\327\224\327\225\327\226\327\227\327\230\327\231\327\233\327\234\327\236\327\235\327\240\327\237\327\241\327\242\327\244\327\243\327\246\327\245\327\247\327\250\327\251\327\252]"
"[^\327\220\327\221\327\222\327\223\327\224\327\225\327\226\327\227\327\230\327\231\327\233\327\234\327\236\327\235\327\240\327\237\327\241\327\242\327\244\327\243\327\246\327\245\327\247\327\250\327\251\327\252]"
"[-']" nil ("-d" "hebrew") nil iso-8859-8)
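To make the octal escapes above easier to read: each \327\2xx pair looks like the two-octet UTF-8 encoding of one Hebrew letter. A minimal sketch to check this by hand (e.g. with M-:), using only the first pair from the output above; nothing here is part of dictionaries-common:

```elisp
;; "\327\220" is a unibyte string holding the two octets 0xD7 0x90.
;; Decoded as UTF-8 they form a single character,
;; HEBREW LETTER ALEF (U+05D0).
(decode-coding-string "\327\220" 'utf-8)

;; Decoded as raw-text the same two octets stay two separate bytes,
;; which is how they show up in the alist entry above.
(length (decode-coding-string "\327\220" 'raw-text))
```

So the entry was read without any UTF-8 decoding, even though the bytes themselves are valid UTF-8.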
> I see three possibilities (again untested, and time to go to bed),
> 1) Make sure emacs loads the file as a data stream. Probably setting
> coding-system-for-read to the right value here will do the trick.
"raw-text" seems adequate. Or maybe "no-conversion" (alias "binary").
The attached patch works for me.
> I think it is set to nil and I expected that to mean no-conversion,
> but it probably just means auto.
Yes:
If the value is a coding system, it is used for decoding on read operation.
If not, an appropriate element is used from one of the coding system alists:
There are three such tables, `file-coding-system-alist', (...)
and file-coding-system-alist contains a fallback:
("" undecided)
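For anyone who wants to see which coding system Emacs actually picks for a given file, here is a rough sketch. The helper name my/coding-system-used is made up for illustration; it relies only on insert-file-contents setting last-coding-system-used:

```elisp
;; Hypothetical helper, not part of dictionaries-common.
(defun my/coding-system-used (file &optional forced)
  "Return the coding system Emacs uses to decode FILE.
With FORCED non-nil, bind `coding-system-for-read' to it first;
with FORCED nil, the usual auto-detection applies."
  (let ((coding-system-for-read forced))
    (with-temp-buffer
      (insert-file-contents file)
      ;; `insert-file-contents' records the coding system it used here.
      last-coding-system-used)))
```

Assuming debian-dict-entries holds the file name, as in the load call from the patch, (my/coding-system-used debian-dict-entries) shows what auto-detection chooses, and (my/coding-system-used debian-dict-entries 'raw-text) shows the effect of the binding.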
>> Compare also the result of:
>> (assoc "francais" debian-ispell-only-dictionary-alist)
>> (...) in the other case, I get
>> ("francais"
>> "[A-Za-z\300\302\307\310\311\312\313\316\317\324\331\333\334\274\340\342\347\350\351\352\353\356\357\364\371\373\374\275]"
>>
>> "[^A-Za-z\300\302\307\310\311\312\313\316\317\324\331\333\334\274\340\342\347\350\351\352\353\356\357\364\371\373\374\275]"
>> "[-']" t ("-d" "francais") "~list" iso-8859-15)
>> (so the 8-bit characters in octal-escaped form.)
> I guess the last worked. Did it?
Yes.
--
Lionel
--- 50dictionaries-common.el~ 2008-02-25 13:45:13.000000000 +0100
+++ 50dictionaries-common.el 2008-05-21 07:57:26.295827281 +0200
@@ -28,7 +28,8 @@
(if (not (file-exists-p "/usr/share/emacs/site-lisp/dictionaries-common/debian-ispell.el"))
(message "Info: Package dictionaries-common removed but not purged.")
(load "debian-ispell" t)
- (load debian-dict-entries t))
+ (let ((coding-system-for-read 'raw-text))
+ (load debian-dict-entries t)))
))
;;; Previous code for loading ispell.el and refreshing spell-checking