Re: latin1 words in an utf-8 file

Christian Ebert Sat, 23 Sep 2006 03:45:57 -0700

Hi Tony,

* A.J.Mechelynck on Saturday, September 23, 2006 at 09:57:40 +0200:
> Christian Ebert wrote:
>> Is it possible to have e.g. iso-8859-1 encoded words/passages in
>> an otherwise utf-8 encoded file? I mean, w/o automatic
                                          without
>> conversion, and I don't need the iso passages displayed in a
>> readable way, but so I can still write the file in utf-8 w/o
>> changing the "invalid" iso-8859-1 chars?
>> 
>> Hm, hope I made myself clear.


Hm, I probably didn't.

<snip detailed explanation with bleeding heart ;)>

> Corollary of the conclusion:
> 
> #1.
> cat file1.utf8.txt file2.latin1.txt file3.utf8.txt > file99.utf8.txt
> 
> will produce invalid output unless the Latin1 input file is actually 7-bit 
> US-ASCII. This is not a limitation of the "cat" program (which inherently 
> never translates anything) but a false manoeuver on the part of the user.

Hm, I want illegal stuff, hehe.
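
To spell out what "illegal" means at the byte level, here is a minimal
Python sketch. It is only an illustration: the sample text is made up,
and the byte concatenation stands in for the cat command above.

```python
# Byte-level stand-in for `cat utf8 latin1 utf8`; the sample text is made up.
utf8_part = "Vögel\n".encode("utf-8")      # "ö" becomes the two bytes 0xC3 0xB6
latin1_part = "Vögel\n".encode("latin-1")  # "ö" becomes the single byte 0xF6
mixed = utf8_part + latin1_part + utf8_part


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(utf8_part))  # the pure utf-8 part is valid
print(is_valid_utf8(mixed))      # the mix is not: 0xF6 never occurs in UTF-8
```

The mixed result only stays valid if the latin1 part happens to be
plain 7-bit ASCII, which is the point of the corollary.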

> #2.
> gvim
>       :if &tenc == "" | let &tenc = &enc | endif
>       :set enc=utf-8 fencs=utf-bom,utf-8,latin1
                             ucs-bom
>       :e ++enc=utf-8 file1.utf8.txt
>       :$r ++enc=latin1 file2.latin1.txt
>       :$r ++enc=utf-8 file3.utf-8.txt
>       :saveas file99.utf8.txt

Then file99.utf8.txt is the same as the one produced with the
cat command. Which is actually what I want.

*But*:

Vim insists on converting the displayed text to latin1. What I
want is to have the contents displayed in utf-8 with a few
illegal characters in latin1.

Now I get:

#v+
VÃ¶gel <- utf-8

Vögel  <- latin1
#v-

because Vim automatically converts to latin1. I'd like to have it
the other way round: the latin1 "Vögel" displayed as garbage,
while I can continue editing the file in _utf-8_.
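
The two views can be made concrete with a small Python sketch. This is
an illustration only, not Vim's behaviour: Python's decoders stand in
for the editor, and the "surrogateescape" error handler is just one
assumed way to keep invalid bytes intact on a round trip.

```python
word = "Vögel"
utf8_bytes = word.encode("utf-8")      # b'V\xc3\xb6gel'
latin1_bytes = word.encode("latin-1")  # b'V\xf6gel'

# What I get now: the utf-8 bytes are shown as if they were latin1.
shown_as_latin1 = utf8_bytes.decode("latin-1")

# What I would like: everything read as utf-8, with the stray latin1
# byte shown as a placeholder (garbage) instead of being converted.
shown_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# The invalid byte must also survive a write: surrogateescape maps it to
# a lone surrogate on decoding and back to the same byte on encoding.
text = latin1_bytes.decode("utf-8", errors="surrogateescape")
written = text.encode("utf-8", errors="surrogateescape")

print(shown_as_latin1)
print(shown_as_utf8)
print(written == latin1_bytes)
```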

Is this possible in *G*vim? (I don't have the GUI installed)

Example snippet from a fictitious LaTeX-file to show the purpose
(or to increase confusion):

#v+
The main part of the file is in utf-8 encoding and contains
non-ascii characters.

Then, I want to typeset, say, one word in \emph{spaced} small
caps. The \LaTeX{} package that does this is not capable of parsing
utf-8 input, so this single word has to be in latin1 in case it
contains non-ascii chars:

\begingroup\inputencoding{latin1}
\caps{V?gel}
\endgroup

to use the above example (with ``?'' for garbage).

\caps{V\"ogel} gives orthographically correct output but I lose
the kerning of the font.

The above example \emph{works,} but the main part of the file is
displayed in ``dissected'' utf-chars.

Is it possible to have this the other way round without automatic
conversion to latin1?
#v-

c
-- 
_B A U S T E L L E N_ lesen! --->> <http://www.blacktrash.org/baustellen.html>
