On Wed, May 21, 2008 at 12:47:57AM +0200, Agustin Martin wrote:
> On Tue, May 20, 2008 at 06:26:34PM +0200, Lionel Elie Mamane wrote:

>> What I *think* happens is that as
>> /var/lib/dictionaries-common/aspell/aspell-he is in UTF-8 and (data
>> out of it) comes at the head of
>> /v/cache/d-c/emacsen-ispell-dicts.el, it triggers emacs to treat
>> the whole /v/c/d-c/emacsen-ispell-dicts.el file as UTF-8; when it
>> encounters ISO-8859 data it thus keeps it as a binary stream of
>> octets rather than converting it to its internal representation
>> (mule?). But without aspell-he installed, it treats the file as
>> iso8859-1!

> I tend to think that emacs finds two non-compatible encodings and because of
> that reads the file as a data stream.

Ah, yes. Indeed:

(assoc "hebrew" debian-aspell-only-dictionary-alist)
("hebrew" 
"[\327\220\327\221\327\222\327\223\327\224\327\225\327\226\327\227\327\230\327\231\327\233\327\234\327\236\327\235\327\240\327\237\327\241\327\242\327\244\327\243\327\246\327\245\327\247\327\250\327\251\327\252]"
 
"[^\327\220\327\221\327\222\327\223\327\224\327\225\327\226\327\227\327\230\327\231\327\233\327\234\327\236\327\235\327\240\327\237\327\241\327\242\327\244\327\243\327\246\327\245\327\247\327\250\327\251\327\252]"
 "[-']" nil ("-d" "hebrew") nil iso-8859-8)
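
The octal escapes in that entry are UTF-8 octet pairs that were never decoded. A minimal sketch to check this, assuming only the first three pairs of the character class; `my-decode-octets' is a hypothetical helper name, not part of dictionaries-common:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-decode-octets (octets coding-system)
  "Decode the unibyte string OCTETS using CODING-SYSTEM."
  (decode-coding-string octets coding-system))

;; The first three octet pairs from the "hebrew" entry.
;; A string literal made only of octal escapes is a unibyte string,
;; so it has 6 elements before decoding.
(length "\327\220\327\221\327\222")
;; => 6

;; Decoded as UTF-8, the same octets are three Hebrew letters
;; (U+05D0 alef, U+05D1 bet, U+05D2 gimel).
(my-decode-octets "\327\220\327\221\327\222" 'utf-8)
;; => "אבג"
```

So the alist holds two octets per letter where a decoded load would hold one character per letter.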

> I see three possibilities (again untested, and time to go to bed),

> 1) Make sure emacs loads the file as a data stream. Probably setting
>    coding-system-for-read to the right value here will do the trick.

"raw-text" seems adequate. Or maybe "no-conversion" alias "binary".

The attached patch works for me.

> I think it is set to nil and I expected that to mean no-conversion,
> but it probably just means auto.

Yes:

 If the value is a coding system, it is used for decoding on read operation.
 If not, an appropriate element is used from one of the coding system alists:
 There are three such tables, `file-coding-system-alist', (...)

and file-coding-system-alist contains a fallback:

 ("" undecided)
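
A rough sketch of how one could ask Emacs which entry it would pick for a given file. The alist is rebound here to contain only the fallback entry quoted above, and the file name is a made-up placeholder; a real `file-coding-system-alist' has more entries ahead of the fallback:

```elisp
;; Assumption: only the catch-all entry is present, so the lookup
;; cannot be shadowed by a more specific regexp.  The path below is
;; a placeholder, not a real dictionaries-common file.
(defvar my-fallback-lookup
  (let ((file-coding-system-alist '(("" undecided))))
    (find-operation-coding-system
     'insert-file-contents "/tmp/example-dicts.el"))
  "Result of the lookup: a cons (DECODING . ENCODING), or nil.")

;; The car is the coding system used for decoding on read.
(car my-fallback-lookup)
```

Note that this lookup only applies when `coding-system-for-read' is not bound to a coding system, which is why binding it around the load bypasses the `undecided' fallback.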

>> Compare also the result of:

>>  (assoc "francais" debian-ispell-only-dictionary-alist)

>> (...) in the other case, I get

>>  ("francais" 
>> "[A-Za-z\300\302\307\310\311\312\313\316\317\324\331\333\334\274\340\342\347\350\351\352\353\356\357\364\371\373\374\275]"
>>  
>> "[^A-Za-z\300\302\307\310\311\312\313\316\317\324\331\333\334\274\340\342\347\350\351\352\353\356\357\364\371\373\374\275]"
>>  "[-']" t ("-d" "francais") "~list" iso-8859-15)

>> (so the 8-bit characters in octal-escaped form.)

> I guess the last worked. Did it?

Yes.
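
For comparison, a small sketch of what the octal escapes in the francais entry stand for. It uses three octets from the character class above and decodes them with the coding system named at the end of that entry; the variable name is made up for the example:

```elisp
;; Three octets taken from the "francais" character class.
;; The variable name is hypothetical, for illustration only.
(defvar my-francais-octets "\300\274\275"
  "Sample octets from the francais CASECHARS regexp.")

;; Decoded with the entry's own coding system, iso-8859-15:
;; 0xC0 is A-grave, 0xBC and 0xBD are the OE ligatures.
(decode-coding-string my-francais-octets 'iso-8859-15)
;; => "ÀŒœ"

;; Decoded as iso-8859-1 the last two octets come out as fractions,
;; which is why the entry's coding system matters.
(decode-coding-string my-francais-octets 'iso-8859-1)
;; => "À¼½"
```

Here each octet is one character, so keeping them as raw octets is harmless as long as the entry's coding system is applied later, when talking to the spell checker.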

-- 
Lionel
--- 50dictionaries-common.el~	2008-02-25 13:45:13.000000000 +0100
+++ 50dictionaries-common.el	2008-05-21 07:57:26.295827281 +0200
@@ -28,7 +28,8 @@
     (if (not (file-exists-p "/usr/share/emacs/site-lisp/dictionaries-common/debian-ispell.el"))
 	(message "Info: Package dictionaries-common removed but not purged.")
       (load "debian-ispell" t)
-      (load debian-dict-entries t))
+      (let ((coding-system-for-read 'raw-text))
+	(load debian-dict-entries t)))
     ))
 
 ;;; Previous code for loading ispell.el and refreshing spell-checking
