Hi Werner,
It looks like I have found what's wrong with cjk-enc.el :-). Here is the
important part of the change in cjk-encode. I got this to work with both
emacs 22 and emacs 23, but *only for thai-tis620*, okay:
=============================================
@@ -590,6 +622,12 @@
(message "Decomposing...")
(decompose-region (point-min) (point-max))))
+ ;; (message "supported charset-list %s" charset-list) ;; debug code
+ (cond ((> emacs-major-version 22)
+ ;; emacs 23+ has charset-priority-list, set-charset-priority.
+ ;; and unicode always comes first, before everything else if not overridden.
+ ;; (message "priority %s" (charset-priority-list))
+ (set-charset-priority 'thai-tis620)))
(let ((enc nil)
(space-state nil)
prev-charset charset
@@ -606,7 +644,7 @@
(set-buffer work-buf)
;; Set CHARSET to the character set of the current character.
- (setq charset (char-charset ch))
+ (setq charset (char-charset ch)) ;; char-charset can accept an optional list to search
(if (eq charset 'ascii)
;; Not a multibyte character.
(progn
=======================================
The problem, in a nutshell, is that in emacs 23 "char-charset" starts to
return "unicode" for every non-ascii character, in preference to
"chinese-big5-1", etc. ("charset-list" has nearly doubled in size).
I can see two ways out: either set the charset priorities and put
thai/chinese/korean first, ahead of unicode, or limit the "char-charset"
search to a list of your choice that excludes "unicode", in the 2nd chunk
above (and probably rework a little of the logic further below in cjk-encode).
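As a rough sketch of the second option (an illustration only, not the actual
patch): the names my-cjk-charsets and my-cjk-char-charset are made up, and the
list is a placeholder that would need to be replaced by the correct one. It
relies on the optional second argument that char-charset takes in emacs 23,
which limits the search to the given charsets and yields nil if none matches.
=============================================
;; Hypothetical placeholder list; the correct contents and order are
;; still to be worked out.
(defvar my-cjk-charsets
  '(thai-tis620 chinese-big5-1 chinese-big5-2)
  "Charsets to try before falling back to the default lookup.")

;; Hypothetical helper, not part of cjk-enc.el.
(defun my-cjk-char-charset (ch)
  "Return the charset of CH, preferring the entries of `my-cjk-charsets'."
  (if (> emacs-major-version 22)
      ;; emacs 23+: search only the restricted list first; if no
      ;; charset in it contains CH, use the unrestricted lookup.
      (or (char-charset ch my-cjk-charsets)
          (char-charset ch))
    ;; emacs 22: char-charset takes a single argument.
    (char-charset ch)))
=============================================
The (setq charset (char-charset ch)) line in the 2nd chunk would then call
my-cjk-char-charset instead.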
I left some debug code above for you to experiment with, to come up with the
correct list for either the first or the 2nd solution outlined. I tried
copying the list from cjk-format-spec-table or cjk-encode, but neither works
very well (I think iso8859-x probably shouldn't be there, or should be in a
different order), so I think I'll stop. For now, having it work for Thai is
good enough for me.
Oh just for some geek credit, if you feel like mentioning my name
(e.g. http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8108), that's fine by me :-).
(my sourceforge address still works and is the preferred address for any public
listings...)
Hin-Tak
Werner LEMBERG wrote:
<snipped>
> --- cjk-enc.el 2011-09-03 21:59:11.000000000 +0200
> +++ cjk-enc.el.new 2011-12-01 08:33:38.000000000 +0100
> @@ -549,11 +549,43 @@
> "Coding-system for LaTeX2e CJK Package"
> '(mnemonic "CJK"
> pre-write-conversion cjk-encode))
> - (make-coding-system
> - 'cjk-coding 0 ?c
> - "Coding-system for LaTeX2e CJK Package"
> - nil
> - '((pre-write-conversion . cjk-encode))))
> + (if (< emacs-major-version 23)
> + (make-coding-system
> + 'cjk-coding 0 ?c
> + "Coding-system for LaTeX2e CJK Package"
> + nil
> + '((pre-write-conversion . cjk-encode)))
> + (define-coding-system
> + 'cjk-coding
> + "Coding-system for LaTeX2e CJK Package"
> + :mnemonic ?c
> + :coding-type 'emacs-mule
> + :charset-list '(ascii
> + latin-iso8859-1
> + latin-iso8859-2
> + latin-iso8859-3
> + latin-iso8859-4
> + cyrillic-iso8859-5
> + greek-iso8859-7
> + thai-tis620
> + vietnamese-viscii-lower
> + vietnamese-viscii-upper
> + latin-jisx0201
> + katakana-jisx0201
> + japanese-jisx0208
> + japanese-jisx0212
> + korean-ksc5601
> + chinese-gb2312
> + chinese-big5-1
> + chinese-big5-2
> + chinese-cns11643-1
> + chinese-cns11643-2
> + chinese-cns11643-3
> + chinese-cns11643-4
> + chinese-cns11643-5
> + chinese-cns11643-6
> + chinese-cns11643-7)
> + :pre-write-conversion 'cjk-encode)))
>
>
> ;; XEmacs doesn't have set-buffer-multibyte.
>