date:20090313

Re: [RFC] Default 'encoding' to UTF-8

2009-03-13 Conversation thread Mike Williams


Matt Wozniski wrote:
 On Mon, Mar 2, 2009 at 8:40 PM, James Vega wrote:
 With Vim's current behavior, 'encoding' is derived from the environment
 and 'fileencoding'/'termencoding' derive from 'encoding' (modulo
 'fileencodings' effect on 'fenc').  This seems sub-optimal for various
 reasons.

 1) Vim is using an internal encoding derived from the environment which
   may or may not be able to represent the different file encodings
   encountered when editing various files.
 2) The encoding Vim uses for interpreting input from the user and
   determining how to display to the user is not directly derived from
   the user's environment.
 3) File encoding detection ('fencs') defaults to a value that is
   unlikely to correctly work with most interesting (non-ascii) files.

 Defaulting 'enc' to UTF-8 helps address these problems.

 1) This is now a non-issue as Vim can internally represent all
   characters by converting them to their unicode counterpart.
 2) This can be addressed by making 'tenc' derive its value from the
   environment instead of from 'enc', which is more in line with the
   behavior implied by the name.
 3) File encoding detection now has a sane default value which means new
   users are less likely to encounter problems when editing files of
   various encodings.

 This change would also allow eliminating 'encoding' as an option or,
 less drastic, disallowing changing 'enc' once the startup files have
 been sourced.

 Changing 'enc' in a running Vim session is a very common mistake among new
 Vim users who are trying to get their file written out in a specific
 encoding or are editing a file that's not in their environment's encoding.
 
 Yeah.  We regularly see people in #vim who don't realize that they
 should be changing 'fenc' instead of 'enc', and I've seen it come up
 on vim-use a few times as well...
 
 The help already states that changing 'enc' in a running session is a
 bad idea, and I know from experience that it can cause Vim to crash[0].
 Taking the next logical step and preventing users from doing that
 (unless someone can provide a compelling reason to continue allowing it)
 makes sense and helps prevent potential data loss.
 
 This sounds like a very good idea to me.  I don't know of any other
 programs that allow you to change encoding used internally, and we
 would be in good company if we chose to always use a unicode encoding
 internally: Java uses UTF-16 internally, and I believe python does as
 well.  Is there any time when it would be desirable to use a
 non-unicode 'encoding' (assuming, of course, that +multi_byte is
 available)?  I can't think of any.

Yes, editing very large (say a few hundred MB) data files that are in a
single-byte encoding.  For my day job I regularly enjoy having to spelunk my
way around large files containing a mix of readable ASCII and binary 
data.  Using a Unicode encoding could make this prohibitive.  Yes, this 
is essentially a raw file edit mode, perhaps that should be an option - 
or would it be part of setting binary mode?

TTFN

Mike
-- 
I am not young enough to know everything.

--~--~-~--~~~---~--~~
You received this message from the vim_dev maillist.
For more information, visit http://www.vim.org/maillist.php
-~--~~~~--~~--~--~---

Re: [RFC] Default 'encoding' to UTF-8

2009-03-13 Conversation thread Matt Wozniski


On Fri, Mar 13, 2009 at 12:01 PM, Mike Williams wrote:

 Matt Wozniski wrote:
 This sounds like a very good idea to me.  I don't know of any other
 programs that allow you to change encoding used internally, and we
 would be in good company if we chose to always use a unicode encoding
 internally: Java uses UTF-16 internally, and I believe python does as
 well.  Is there any time when it would be desirable to use a
 non-unicode 'encoding' (assuming, of course, that +multi_byte is
 available)?  I can't think of any.

 Yes, editing very large (say a few hundred MB) data files that are in a
 single-byte encoding.  For my day job I regularly enjoy having to spelunk my
 way around large files containing a mix of readable ASCII and binary
 data.  Using a Unicode encoding could make this prohibitive.  Yes, this
 is essentially a raw file edit mode, perhaps that should be an option -
 or would it be part of setting binary mode?

How would using Unicode for 'enc' in any way affect this?  Sure, you'd
want to use a single-byte 'fenc', but no one is suggesting that the
'fenc' option should be removed.  If there is a reason why editing
binary files should be affected at all by what encoding the editor
uses for storing the buffer text internally, I don't see it and you'll
need to elaborate.
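
As a rough illustration of the point above, here is a minimal Python
sketch (an analogy only, not Vim's implementation). It assumes the file
is read with a single-byte encoding such as latin-1, held internally as
Unicode text, and written back with the same encoding. The helper name
round_trip is hypothetical.

```python
def round_trip(raw: bytes, file_encoding: str = "latin-1") -> bytes:
    """Decode bytes to internal Unicode text, then encode them back.

    This mimics the idea of a single-byte 'fenc' combined with a
    Unicode internal representation.
    """
    internal = raw.decode(file_encoding)   # bytes -> Unicode code points
    return internal.encode(file_encoding)  # Unicode code points -> bytes


# Every possible byte value, as found in a mixed ASCII/binary file.
all_bytes = bytes(range(256))

# latin-1 maps each byte to exactly one code point, so nothing is lost.
restored = round_trip(all_bytes)
print(restored == all_bytes)
```

Under these assumptions every byte value survives the trip, because
latin-1 maps each of the 256 byte values to exactly one code point.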

~Matt

Re: [RFC] Default 'encoding' to UTF-8

2009-03-13 Conversation thread Mike Williams


Matt Wozniski wrote:
 On Fri, Mar 13, 2009 at 12:01 PM, Mike Williams wrote:
 Matt Wozniski wrote:
 This sounds like a very good idea to me.  I don't know of any other
 programs that allow you to change encoding used internally, and we
 would be in good company if we chose to always use a unicode encoding
 internally: Java uses UTF-16 internally, and I believe python does as
 well.  Is there any time when it would be desirable to use a
 non-unicode 'encoding' (assuming, of course, that +multi_byte is
 available)?  I can't think of any.
 Yes, editing very large (say a few hundred MB) data files that are in a
 single-byte encoding.  For my day job I regularly enjoy having to spelunk my
 way around large files containing a mix of readable ASCII and binary
 data.  Using a Unicode encoding could make this prohibitive.  Yes, this
 is essentially a raw file edit mode, perhaps that should be an option -
 or would it be part of setting binary mode?
 
 How would using Unicode for 'enc' in any way affect this?  Sure, you'd
 want to use a single-byte 'fenc', but no one is suggesting that the
 'fenc' option should be removed.  If there is a reason why editing
 binary files should be affected at all by what encoding the editor
 uses for storing the buffer text internally, I don't see it and you'll
 need to elaborate.

With a UTF-16 internal encoding a 250MB data file blossoms into a nice 
round 500MB.  For all the cheap memory these days this will still have 
an effect on system performance - time to allocate, paging out of idle 
apps to disk, etc.
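
The size arithmetic can be checked with a small Python sketch. It is an
illustration only and says nothing about how Vim actually stores buffer
text. It assumes the data is single-byte (latin-1) and compares the
encoded size in UTF-16 with the size in UTF-8, the encoding named in
the RFC subject. The helper name encoded_sizes is hypothetical.

```python
def encoded_sizes(raw: bytes) -> dict:
    """Return the size in bytes of single-byte data in three encodings."""
    text = raw.decode("latin-1")  # one code point per input byte
    return {
        "latin-1": len(raw),
        "utf-16": len(text.encode("utf-16-le")),  # 2 bytes per code point here
        "utf-8": len(text.encode("utf-8")),       # 1 byte below 0x80, else 2
    }


# A small stand-in for a binary file: every byte value, repeated.
sample = bytes(range(256)) * 4
sizes = encoded_sizes(sample)
print(sizes)
```

With UTF-16 the size doubles exactly, as described above. With UTF-8
the growth depends on how many bytes are 0x80 or above: pure ASCII does
not grow at all, while data made entirely of high bytes doubles.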

And will VIM internally use a canonical Unicode form?  What happens if I
want to insert some 8-bit data whose Unicode character has multiple
forms?  Which one is used?  How will I know that the 8-bit value I
intend does not appear as a composed sequence?  I haven't used VIM for
editing Unicode with composing characters (damn my native English
country) - I see there is some discussion on composing, but at first
glance it is not clear whether it is automatic or not.  In my case I
would not want deletion of a data byte to result in other bytes being
deleted as well.
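
To make the concern about multiple forms concrete, here is a Python
sketch using the standard unicodedata module. It shows what happens
only if some layer normalizes the text; whether Vim ever does this is
exactly the open question, and the sketch does not answer it.

```python
import unicodedata

raw = b"\xe9"                 # one 8-bit value
text = raw.decode("latin-1")  # decodes to the single code point U+00E9

# Canonical decomposition (NFD) splits it into 'e' plus U+0301,
# a combining acute accent.
decomposed = unicodedata.normalize("NFD", text)
print(len(text), len(decomposed))

# Without normalization the original byte comes back unchanged.
print(text.encode("latin-1") == raw)

# The decomposed form can no longer be written as latin-1 at all,
# because U+0301 has no latin-1 byte.
try:
    decomposed.encode("latin-1")
    survives = True
except UnicodeEncodeError:
    survives = False
print(survives)
```

So a byte-preserving round trip depends on the text never being
normalized or recomposed between reading and writing.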

At the moment I cannot see how supporting Unicode semantics maps to 
editing binary data files.  Not saying it is impossible, I'd just like 
to see the possible way out of the woods if we did go this way.

TTFN

Mike
-- 
Imagination is more important than knowledge.

Re: variable tabstops feature branch in vim_extended.git

2009-03-13 Conversation thread Matthew Winn


On Thu, 12 Mar 2009 23:38:44 +0100, Markus Heidelberg
markus.heidelb...@web.de wrote:

 Patch 7.2.137 has rewritten half of the function shift_block() in
 src/ops.c. This is the part that was heavily modified by the variable
 tabstops patch, so I ended up with a merge conflict. I will first
 remove the feature from the branch so that I can update the
 master branch.
 
 If someone who actually uses this feature or Matthew himself would
 update the patch, I would be eager to include it again.

I've been meaning to work on the patch but ill health has intervened.
I'll see if I can find the time to bring the patch up to date.

-- 
Matthew Winn

Re: [PATCH] support for the bang in :diffthis (was Re: [PATCH] :diffoff should not change settings for non-diff windows)

2009-03-13 Conversation thread Markus Heidelberg


John Beckett, 09.03.2009:
 
 Bram Moolenaar wrote:
  It's perhaps a bit strange to use :diffthis! to start diff
  mode in other windows.  :diffall would be more obvious.
  It's not symmetric with :diffoff vs :diffoff!, but that
  one doesn't say this.
  
  What do you all think about using :diffall instead?
 
 There is already quite a bit of history in Vim, so I would prefer to
 NOT introduce a new command.

If :diffthis didn't yet exist, I'd definitely prefer :diffon[!] and
:diffoff[!]. But now I also prefer :diffthis!. A new command :diffall
would probably be a bit confusing, since we would end up with 3 commands
(this|off|all) for 2 (or 4) actions (diff on/off).

I just thought: if we treat :diffthis as "diff this window" and
:diffthis! as "diff this tab", then the "this" in the latter command
doesn't seem so linguistically strange any more.

 The following looks logical to me:
 
   Go to window 1
  :diffthis
   Go to window 2
  :diffthis
 
   Remove diff from all visible windows
  :diffoff!
 
   Apply diff to all visible (normal) windows
  :diffthis!

This is my preferred solution, too.

Markus

