Re: [gentoo-user] vim encoding

2007-08-15 Thread Willie Wong
On Tue, Aug 14, 2007 at 08:38:38AM +0200, Penguin Lover Michal 'vorner' Vaner 
squawked:
 Hello
 
 On Mon, Aug 13, 2007 at 10:42:17PM +0100, Mick wrote:
  Hmm, I just checked a utf-8 file after I edited it and it says:
  
  :set encoding
   encoding=latin1
 
 I would guess your UTF-8 file has no accents, or other characters. In
 other words, it can be considered pure ASCII, which means Vim can safely
 assume it is latin1 encoded text - there is no difference no matter
 which reasonable encoding it chooses. (The encoding is not saved in the
 file, it is guessed from what is saved there)

Uh, that's not exactly correct.

encoding (:help enc) is the encoding Vim uses internally for the
buffer; its default is taken from LANG. Unless you set termencoding
separately, it is also assumed to be the encoding of your terminal,
so it governs how you enter text and how the buffer is displayed on
the screen.

fileencoding (:help fenc) is what determines how the file is saved
to disk. Vim guesses it automatically when reading the file, by
trying the encodings listed in the fileencodings option in order.

Only when vim cannot guess what encoding the file originally had (in
which case fenc is the empty string), or when you start editing a new
file without specifying fenc, will vim write the file in the encoding
given by the enc option, without any conversion. Given that the
default vim install on gentoo has utf-8 among the fileencodings
values, vim should correctly detect a utf-8 file as such and, even
though you are editing it in a latin1 environment, convert it back to
utf-8 on save.
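
To make the detection step concrete, here is a rough sketch in Python of
"try each candidate in order and keep the first one that decodes
cleanly". It is only a simplified model, not Vim's actual code: the
function name guess_fileencoding and the candidate list are made up for
illustration, and the empty return value merely stands in for an empty
fenc.

```python
# Simplified model of how a 'fileencodings' list is tried in order.
# Illustration only; this is not Vim's actual detection code.

def guess_fileencoding(raw, candidates):
    """Return the first candidate encoding that decodes raw without error."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return ""  # nothing matched; modelled here as an empty fenc


# Hypothetical candidate list, loosely modelled on "utf-8,latin1".
CANDIDATES = ["utf-8", "latin-1"]

utf8_bytes = "na\u00efve".encode("utf-8")      # i-diaeresis stored as two bytes
latin1_bytes = "na\u00efve".encode("latin-1")  # i-diaeresis stored as one byte

print(guess_fileencoding(utf8_bytes, CANDIDATES))
print(guess_fileencoding(latin1_bytes, CANDIDATES))
```

An 8-bit encoding such as latin1 can decode any byte sequence, so it
never fails; that is why it only makes sense as the last entry in the
list.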

W
-- 
Statistics are like a Bikini: 
  showing interesting details but hiding the important stuff.
Sortir en Pantoufles: up 250 days, 19:22
-- 
[EMAIL PROTECTED] mailing list



Re: [gentoo-user] vim encoding

2007-08-14 Thread Michal 'vorner' Vaner
Hello

On Mon, Aug 13, 2007 at 10:42:17PM +0100, Mick wrote:
 Hmm, I just checked a utf-8 file after I edited it and it says:
 
 :set encoding
  encoding=latin1

I would guess your UTF-8 file has no accented or other non-ASCII
characters. In other words, it can be considered pure ASCII, which means
Vim can safely assume it is latin1 encoded text - there is no difference
no matter which reasonable encoding it chooses. (The encoding is not
saved in the file; it is guessed from what is saved there.)
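
The point about pure ASCII can be checked with a few lines of Python.
This is just a small illustration; the sample strings are invented and
have nothing to do with the actual sitemap file.

```python
# Pure ASCII text produces identical bytes in ASCII, Latin-1 and UTF-8.
ascii_text = "<urlset>sitemap</urlset>"   # hypothetical sample content
same = (
    ascii_text.encode("ascii")
    == ascii_text.encode("latin-1")
    == ascii_text.encode("utf-8")
)
print(same)

# One accented character is enough to make the two encodings differ.
accented = "caf\u00e9"
print(accented.encode("latin-1"))   # e-acute as a single byte
print(accented.encode("utf-8"))     # e-acute as a two-byte sequence
```

So for an ASCII-only file the bytes on disk alone cannot tell you which
of these encodings was intended.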

-- 
do { goto Water; } while( !tryBreakOffTheEar() );

Michal 'vorner' Vaner




Re: [gentoo-user] vim encoding

2007-08-14 Thread Mick
On 13/08/07, Benno Schulenberg wrote:
 Mick wrote:
  Hmm, I just checked a utf-8 file after I edited it and it says:
 
  :set encoding
   encoding=latin1
 
  I assume this means that it was changed from utf8 to latin1

 No.  To see what encoding a file has, you could use 'file'.  Run
 'file thefileyouedited', and it should say UTF-8 Unicode text.
 When you open the file again with vim, it will say [converted] on
 the status line: converted from utf-8 to latin1.

That's what I did first of all, but got this in return:

$ file Website/sitemap.xml
Website/sitemap.xml: XML


 Because your LANG isn't set, the default is Latin1, as you could
 have learned by typing ':help enc' in vim.

Thanks for the tip.  The depths of Vim's help pages never cease to amaze me.

  Do I have to change my locale?

 Depends on what you want.  If new files should be UTF-8 encoded,
 then change your locale.  Otherwise you're fine as you are.

Hmm, not sure:  I want to be able to edit text files, which may be
occasionally just ascii, or MS Windows, or UTF-8, without creating any
problems when these are opened at the receiving end.  The above file
was a UTF-8 file before I edited it and Google will spit it out if the
encoding has been changed.

If I understand the autoconversion feature correctly, vim will
open/convert it; then I will edit it in whichever encoding my locale
implies (latin1); then it will convert/save it back in UTF-8 . . .
unless I have entered any funny characters while editing(?).  I believe
that in most cases this should be satisfactory.

Alternatively, I will use e.g.  :set enc=utf-8 in Vim and carry on
merrily editing the txt file.

Or, I leave Vim encoding alone and run export LANG=en_GB.UTF-8 and
Vim will use that.

Did I get this right?

PS.  What I am not entirely sure about is where the locale is set for
my system.  When I look into /etc/env.d/ I cannot find the file
02locale.
-- 
Regards,
Mick



Re: [gentoo-user] vim encoding

2007-08-14 Thread Mick
On 14/08/07, Michal 'vorner' Vaner wrote:
 Hello

 On Mon, Aug 13, 2007 at 10:42:17PM +0100, Mick wrote:
  Hmm, I just checked a utf-8 file after I edited it and it says:
 
  :set encoding
   encoding=latin1

 I would guess your UTF-8 file has no accents, or other characters. In
 other words, it can be considered pure ASCII, which means Vim can safely
 assume it is latin1 encoded text - there is no difference no matter
 which reasonable encoding it chooses. (The encoding is not saved in the
 file, it is guessed from what is saved there)

Thank you, this makes it clear.
-- 
Regards,
Mick



Re: [gentoo-user] vim encoding

2007-08-14 Thread Benno Schulenberg
Mick wrote:
 Or, I leave Vim encoding alone and run export LANG=en_GB.UTF-8
 and Vim will use that.

 Did I get this right?

Precisely.  But why don't you just try it and see how it behaves?

 PS.  What I am not entirely sure about is where is the locale set
 for my system?

When it's not set anywhere, it defaults to POSIX.

 When I look into /etc/env.d/ I cannot find the file 02locale.

You have to make that file yourself.  If you google for 02locale 
on site:gentoo.org, you will find these:

http://www.gentoo.org/doc/en/guide-localization.xml
http://www.gentoo.org/doc/en/utf-8.xml

Benno



[gentoo-user] vim encoding

2007-08-13 Thread Mick
Hi All,

I am trying to find out how I can see what encoding my vim is using.  It
would also be good to know how to set it to a different encoding, if I need to.

Some other questions that may help me understand how encoding works:

 - My /etc/vim/vimrc says scriptencoding utf-8, does this mean that this is 
the vim encoding and any new file will be saved with this encoding?

 - If I open a file which was saved with ISO-8859-1, edit it and save it, will 
it keep the original encoding?

 - The vimrc says:

     " Make sure we have a sane fallback for encoding detection
     set fileencodings+=default

   I guess this is the system default.  How can I find out what the default
   setting is?

Sorry if these are simple questions, but I never got my head around encodings
and character sets.
-- 
Regards,
Mick




Re: [gentoo-user] vim encoding

2007-08-13 Thread Steffen Loos
Mick wrote:
 Hi All,
 
 I am trying to find out how I can see what encoding my vim is using.  Also 
 would be good to know how to set it to a different encoding, if I need to.

:set encoding
should show you the encoding currently in use.

You can also set another encoding with, e.g.:
:set encoding=utf-8

Please also see
:help encoding

Steffen



Re: [gentoo-user] vim encoding

2007-08-13 Thread Benno Schulenberg
Mick wrote:
  - My /etc/vim/vimrc says scriptencoding utf-8, does this mean
 that this is the vim encoding and any new file will be saved with
 this encoding?

No, scriptencoding is just the encoding of /etc/vim/vimrc.  File 
encoding is handled by 'fileencodings' further down.

  - If I open a file which was saved with ISO-8859-1, edit it and
 save it, will it keep the original encoding?

Yes.  Vim will not change the encoding of a file unless you tell it 
to.  If you typed characters that don't fit in ISO-8859-1 (which can 
happen if you use a UTF-8 locale), you will get a CONVERSION ERROR 
upon writing the file, and 'quit' will refuse to quit without you 
forcing it.
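
What "don't fit in ISO-8859-1" means can be shown with a short Python
check. This only illustrates the character-set limitation; it says
nothing about how Vim reports the error, and the helper name
fits_in_latin1 is made up for the example.

```python
def fits_in_latin1(text):
    """Return True if every character of text can be stored as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


print(fits_in_latin1("caf\u00e9"))        # e-acute is part of ISO-8859-1
print(fits_in_latin1("price: 5 \u20ac"))  # the euro sign is not
```

Any text for which this returns False is the kind that cannot be written
back to an ISO-8859-1 file without loss.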

  - The vimrc says:  Make sure we have a sane fallback for
 encoding detection set fileencodings+=default  I guess this is
 system default.  How can I find what is the default setting?

Open a new file and type :set enc; it will show the current 
(default) encoding.  This default is what LANG says.  See the 
output of 'locale'.
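
As a rough illustration of "this default is what LANG says", the
following Python sketch pulls the codeset out of a LANG value. It is a
simplified model, not what Vim really does internally: the function
name codeset_from_lang is invented, and the latin1 fallback is simply
the default mentioned in this thread for an unset LANG.

```python
def codeset_from_lang(lang, fallback="latin1"):
    """Extract the codeset part of a LANG value such as 'en_GB.UTF-8'."""
    # Locale names have the form language[_territory][.codeset][@modifier]
    if "." not in lang:
        return fallback                  # "", "C" and "POSIX" name no codeset
    codeset = lang.split(".", 1)[1]
    codeset = codeset.split("@", 1)[0]   # drop a modifier such as "@euro"
    return codeset.lower()


print(codeset_from_lang(""))
print(codeset_from_lang("en_GB.UTF-8"))
print(codeset_from_lang("de_DE.ISO-8859-1@euro"))
```

With an empty LANG there is no codeset to read, which is the situation
in which the fallback applies.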

Benno



Re: [gentoo-user] vim encoding

2007-08-13 Thread Mick
On 13/08/07, Benno Schulenberg wrote:
 Mick wrote:
   - My /etc/vim/vimrc says scriptencoding utf-8, does this mean
  that this is the vim encoding and any new file will be saved with
  this encoding?

 No, scriptencoding is just the encoding of /etc/vim/vimrc.  File
 encoding is handled by 'fileencodings' further down.

   - If I open a file which was saved with ISO-8859-1, edit it and
  save it, will it keep the original encoding?

 Yes.  Vim will never change the encoding of a file.  If you typed
 characters that don't fit in ISO-8859-1 (which can happen if you use
 a utf locale), you will get a CONVERSION ERROR upon writing the
 file, and 'quit' will refuse to quit without you forcing it.

Hmm, I just checked a utf-8 file after I edited it and it says:

:set encoding
 encoding=latin1

I assume this means that it was changed from utf8 to latin1 (whatever
this is . . . is it related to ISO-8859-1?)

   - The vimrc says:  Make sure we have a sane fallback for
  encoding detection set fileencodings+=default  I guess this is
  system default.  How can I find what is the default setting?

 Open a new file and type :set enc; it will show the current
 (default) encoding.  This default is what LANG says.  See the
 output of 'locale'.

The output of locale gives me:
===
$ locale
LANG=
LC_CTYPE=POSIX
LC_NUMERIC=POSIX
LC_TIME=POSIX
LC_COLLATE=POSIX
LC_MONETARY=POSIX
LC_MESSAGES=POSIX
LC_PAPER=POSIX
LC_NAME=POSIX
LC_ADDRESS=POSIX
LC_TELEPHONE=POSIX
LC_MEASUREMENT=POSIX
LC_IDENTIFICATION=POSIX
LC_ALL=
===

Not sure I understand what all this means.  Is my Vim installation
working as it should? Do I have to change my locale?
-- 
Regards,
Mick



Re: [gentoo-user] vim encoding

2007-08-13 Thread Benno Schulenberg
Mick wrote:
 Hmm, I just checked a utf-8 file after I edited it and it says:

 :set encoding
  encoding=latin1

 I assume this means that it was changed from utf8 to latin1

No.  To see what encoding a file has, you could use 'file'.  Run 
'file thefileyouedited', and it should say UTF-8 Unicode text.  
When you open the file again with vim, it will say [converted] on 
the status line: converted from utf-8 to latin1.

 (whatever this is . . . is it relevant to ISO-8859-1?)

Latin1 is a synonym for ISO-8859-1.

Because your LANG isn't set, the default is Latin1, as you could 
have learned by typing ':help enc' in vim.

 Not sure I understand what all this means.  Is my Vim
 installation working as it should?

You can edit any file you like, vim will auto-convert on read and 
write.
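
The convert-on-read, convert-on-write round trip can be modelled in a
few lines of Python. This is a sketch under the assumption that the file
is UTF-8 and the internal encoding is latin1, as in Mick's case; the two
helper functions are hypothetical names and do not correspond to
anything in Vim.

```python
def read_into_buffer(raw, fenc, enc):
    """Convert file bytes from the file encoding to the internal encoding."""
    return raw.decode(fenc).encode(enc)


def write_from_buffer(buf, enc, fenc):
    """Convert buffer bytes from the internal encoding back to the file encoding."""
    return buf.decode(enc).encode(fenc)


on_disk = "caf\u00e9\n".encode("utf-8")
buf = read_into_buffer(on_disk, "utf-8", "latin-1")
written = write_from_buffer(buf, "latin-1", "utf-8")

print(buf)                 # the text as held internally, in latin1
print(written == on_disk)  # the bytes written back match the original
```

The round trip is lossless here because every character in the sample
exists in latin1. A file containing characters outside latin1 would make
read_into_buffer raise an error in this sketch; how Vim itself handles
that case is not modelled.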

 Do I have to change my locale? 

Depends on what you want.  If new files should be UTF-8 encoded, 
then change your locale.  Otherwise you're fine as you are.

Benno