On Thu, Dec 15 2005, David Hansen wrote:

> (prefer-coding-system 'latin-1)
> (prefer-coding-system 'latin-9)
> (prefer-coding-system 'windows-1252)
> (prefer-coding-system 'utf-8)

I'd expect that the latin-1 line _after_ windows-1252 doesn't make
sense.  Any file that can possibly be encoded with Latin-1 can also be
encoded using windows-1252 (a proper superset, apart from the rarely
used C1 control bytes 0x80-0x9F).  So Emacs will never choose Latin-1,
I think.  Probably the same argument holds for Latin-9, but I'm not
completely sure (does windows-1252 contain _all_ chars from Latin-9?).
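
For what it's worth, a quick check with Python's standard codecs
(cp1252 is Python's name for windows-1252) suggests the answer is yes
-- this is just an illustration, not anything Emacs does internally:

```python
# Do all printable Latin-9 (iso-8859-15) characters exist in windows-1252?
# Latin-9 differs from Latin-1 in eight positions (Euro sign, S/s with
# caron, Z/z with caron, OE/oe ligature, Y with diaeresis).
latin9_chars = [bytes([i]).decode("iso-8859-15")
                for i in list(range(0x20, 0x7F)) + list(range(0xA0, 0x100))]

missing = []
for ch in latin9_chars:
    try:
        ch.encode("cp1252")          # cp1252 == windows-1252
    except UnicodeEncodeError:
        missing.append(ch)

print(missing)  # -> []  (every Latin-9 character is representable)
```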

Of course UTF-8 also covers Latin-* and windows-1252, but iso-8859*
encoded files containing non-ASCII characters are almost never valid
UTF-8.  And valid UTF-8 files with multi-byte characters, while
decodable as iso-8859*, come out as implausible gibberish.  Thus Emacs
(or file(1)) is able to distinguish UTF-8 from iso-8859*.
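
A minimal sketch of that distinction, using Python's strict UTF-8
decoder (again just an illustration of the principle):

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8 (strict decode succeeds)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

latin1_bytes = "Tür".encode("latin-1")   # b'T\xfcr' -- lone 0xFC byte
utf8_bytes = "Tür".encode("utf-8")       # b'T\xc3\xbcr' -- valid sequence

print(looks_like_utf8(latin1_bytes))     # False
print(looks_like_utf8(utf8_bytes))       # True
```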

[ Coming back to Ralf's question: ]
On Wed, Dec 14 2005, Ralf Angeli wrote:
> would it be possible for Emacs to figure out the right coding system
> by itself in the case at hand?  That means without me having to
> specify coding systems explicitly by means of preferred coding
> system options, coding cookies, or `C-x RET c' and similar.

No.  A program cannot distinguish iso-8859-1 from iso-8859-2 or -15
reliably.  Same for windows-1252 vs. windows-1258 (0x80 in your
example file).  Heuristic approaches[1] might be possible, though.
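
To see why: the very same byte is valid in all of these charsets and
simply means different characters, so a strict decoder has no error to
latch onto.  A small illustration (cp1252/cp1258 are Python's names
for windows-1252/windows-1258):

```python
# Byte 0xA4: currency sign in Latin-1, Euro sign in Latin-9 --
# both decodes succeed, so validity alone cannot decide.
print(b"\xa4".decode("iso-8859-1"))   # '¤'
print(b"\xa4".decode("iso-8859-15"))  # '€'

# Byte 0x80: Euro sign in both windows-1252 and windows-1258.
print(b"\x80".decode("cp1252"))       # '€'
print(b"\x80".decode("cp1258"))       # '€'
```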

Bye, Reiner.

[1] There was a discussion about this in a German newsgroup; see the
    monster thread starting with <news:[EMAIL PROTECTED]>.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/



_______________________________________________
emacs-pretest-bug mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug
