On Thu, Dec 15 2005, David Hansen wrote:
> (prefer-coding-system 'latin-1)
> (prefer-coding-system 'latin-9)
> (prefer-coding-system 'windows-1252)
> (prefer-coding-system 'utf-8)
I'd expect that the latin-1 line _after_ windows-1252 doesn't make
sense. Any file that can be encoded in Latin-1 can also be
encoded in windows-1252 (a proper superset). So Emacs will never
choose Latin-1, I think. Probably the same argument holds for
Latin-9, but I'm not completely sure (does windows-1252 contain _all_
chars from Latin-9?).
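
The superset question can be checked mechanically. What follows is only a rough sketch in Python (picked for illustration; it consults Python's codec tables, not the ones Emacs uses, and the helper name compare_with_cp1252 is made up here). It looks at the printable 8-bit range 0xA0-0xFF only and ignores the C1 control range 0x80-0x9F:

```python
def compare_with_cp1252(source):
    """Compare the printable 8-bit range (0xA0-0xFF) of `source` with cp1252.

    Returns (missing, moved): characters cp1252 cannot encode at all, and
    characters it can encode but only with a different byte value.
    """
    missing, moved = [], []
    for byte in range(0xA0, 0x100):
        raw = bytes([byte])
        char = raw.decode(source)
        try:
            encoded = char.encode("cp1252")
        except UnicodeEncodeError:
            missing.append(char)
            continue
        if encoded != raw:
            moved.append(char)
    return missing, moved


latin1_missing, latin1_moved = compare_with_cp1252("iso8859_1")
latin9_missing, latin9_moved = compare_with_cp1252("iso8859_15")

print("Latin-1:", latin1_missing, latin1_moved)
print("Latin-9:", latin9_missing, "".join(latin9_moved))
```

With Python's tables, nothing is missing for either charset, but eight Latin-9 characters (the euro sign among them) sit at different byte values in windows-1252. So the characters are all there, yet a Latin-9 file is not byte-for-byte a windows-1252 file.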
Of course UTF-8 also covers Latin-* and windows-1252, but iso-8859*
encoded files containing non-ASCII characters are almost never valid
UTF-8 files. And valid UTF-8 files with multi-byte characters rarely
look like plausible iso-8859 files. Thus Emacs (or file(1)) is usually
able to distinguish UTF-8 from iso-8859*.
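
As an illustration of why this distinction works, here is a minimal sketch in Python (not what Emacs or file(1) actually do internally; the helper names is_valid_utf8 and has_c1_bytes are invented for this example):

```python
def is_valid_utf8(data):
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def has_c1_bytes(data):
    """Return True if any byte falls in 0x80-0x9F.

    In iso-8859-* that range holds control characters, which are
    unusual in ordinary text files.
    """
    return any(0x80 <= b <= 0x9F for b in data)


text = "Gr\u00fc\u00dfe"               # "Gruesse" with u-umlaut and sharp s
as_latin1 = text.encode("iso8859_1")   # b'Gr\xfc\xdfe'
as_utf8 = text.encode("utf-8")         # b'Gr\xc3\xbc\xc3\x9fe'

print(is_valid_utf8(as_latin1))  # False: 0xFC cannot start a UTF-8 sequence
print(is_valid_utf8(as_utf8))    # True
print(has_c1_bytes(as_utf8))     # True: 0x9F would be a control char in Latin-1
```

The check is not watertight: pure ASCII is valid in all of these encodings, and a few Latin-1 byte pairs (0xC3 0xA9, for instance) happen to form valid UTF-8 as well.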
[ Coming back to Ralf's question: ]
On Wed, Dec 14 2005, Ralf Angeli wrote:
> would it be possible for Emacs to figure out the right coding system
> by itself in the case at hand? That means without me having to
> specify coding systems explicitly by means of preferred coding
> system options, coding cookies, or `C-x RET c' and similar.
No. A program cannot distinguish iso-8859-1 from iso-8859-2 or -15
reliably. Same for windows-1252 vs. windows-1258 (0x80 in your
example file). Heuristic approaches[1] might be possible, though.
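
To see the ambiguity concretely, here is a small sketch in Python (again only for illustration, using Python's codecs; the function name decodings and the sample bytes are assumptions, not taken from the example file). The same bytes decode without any error under several 8-bit charsets and simply yield different text:

```python
def decodings(data, codecs=("iso8859_1", "iso8859_2", "iso8859_15",
                            "cp1252", "cp1258")):
    """Decode `data` with each codec; None marks a decoding error."""
    results = {}
    for name in codecs:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# 0xA4 and 0xB1 are legal in all three ISO charsets but mean different things.
iso_sample = decodings(b"\xa4\xb1")
for name in ("iso8859_1", "iso8859_2", "iso8859_15"):
    print(name, repr(iso_sample[name]))

# 0x80 is the euro sign in both Windows code pages; 0xE3 is where they differ.
win_sample = decodings(b"\x80\xe3", codecs=("cp1252", "cp1258"))
for name, text in win_sample.items():
    print(name, repr(text))
```

Nothing in the bytes themselves says which reading is intended. A heuristic would have to judge which decoded text looks more plausible, for example by letter frequencies or a dictionary.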
Bye, Reiner.
[1] There was a discussion about this in the German newsreader group;
see the monster thread starting with
<news:[EMAIL PROTECTED]>.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/