Re: [PROPOSAL] new 'systemencoding' option.

kikuchan Thu, 14 Oct 2010 10:30:06 -0700

Thank you all for your replies.

Bram wrote:
> It makes sense to me.  For Windows we indeed have code to convert
> between 'encoding' and UTF-16, which is supported by the library
> functions.  On Unix there is no such translation.  I actually do not
> know what Unix standard specifies what encoding file names are in.  It
> might depend on the mounted drive, in which case it may differ per
> device.

Yes, it may differ per device on unix-like systems.
This happens, for example, when a user mounts a USB pen drive formatted
on Windows.

But modern unix-like systems have a translation layer for filesystem encoding.
# This is done by the kernel.

So a user usually mounts the pen drive as a utf-8 encoded filesystem,
regardless of the actual filename encoding on the pen drive.

But some users, including me, still mainly use a non-unicode filesystem
encoding, for backward compatibility.

Bram wrote:
> I have not seen this subject before.

and, Benjamin wrote:
> Can you describe your system setup?  Trying to create a test case for
> this, I made a tiny vfat filesystem, but then found that the mount
> options for 'VFAT' state that long filenames are stored in Unicode (I
> believe as UTF-16).  Short filenames depend on a codepage, but the
> utility of a feature like this, just to fix 8.3 filenames seems dubious.

I think this problem does not happen on Windows, because of os_win32.c.
But I'm not sure.

There is no standard encoding for file names on unix-like systems.
Any character other than '/' and NUL is accepted in a file name,
even a non-printable control code.
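
To illustrate the rule above, here is a minimal Python sketch (not part of
the patch). The helper name is_valid_unix_name is hypothetical, and it checks
only the two reserved byte values; it ignores the special names "." and ".."
and any length limit a particular filesystem may impose.

```python
def is_valid_unix_name(name: bytes) -> bool:
    """Return True if `name` contains no byte that a unix-like system
    reserves inside a single file name ('/' and NUL).

    Hypothetical helper for illustration only: no encoding is checked,
    because the file name is just a sequence of bytes.
    """
    return len(name) > 0 and b"/" not in name and b"\0" not in name


print(is_valid_unix_name(b"\xa9"))      # True: latin1 copyright mark
print(is_valid_unix_name(b"\x01\x1b"))  # True: control codes are accepted
print(is_valid_unix_name(b"a/b"))       # False: '/' separates path components
```

Nothing in this check says which encoding the bytes are in, so the program
reading the name has to be told.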

You can reproduce this problem with the following steps.
# I assume you are on a unix-like system.
----------
# First, launch an (old, non-unicode) xterm (or whatever; maybe the console).

# Create a file with a latin1 filename (a copyright mark).
  % touch `printf '\0251'`

# Execute Vim with 'encoding=latin1'.
  % vim -u NONE --cmd 'set encoding=latin1' `printf '\0251'`

# You can see the copyright mark filename, as expected.
# Then, let's write some text in latin1.
#  e.g. just type: i<C-v>xa9<ESC>:wq<CR>

# Execute Vim with utf-8, but imagine the user still uses a latin1 terminal.
  % vim -u NONE --cmd 'set encoding=utf-8' \
                --cmd 'set termencoding=latin1' `printf '\0251'`

  # or, just use gVim with 'encoding=utf-8'

  % gvim -u NONE --cmd 'set encoding=utf-8' `printf '\0251'`

# Then you can see the proper file contents, as expected.
# But the filename in the status line is a mess (non-printable <a9>).
----------
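
Why the status line shows <a9> can be sketched in a few lines of Python (an
illustration only, not the code in the patch). The function names decodes_as
and filename_for_display are hypothetical; the single byte 0xA9 is the latin1
copyright mark used as the filename in the steps above.

```python
raw_name = b"\xa9"  # the filename bytes as stored on the filesystem


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` is a valid byte sequence in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def filename_for_display(data: bytes, filename_encoding: str,
                         internal_encoding: str = "utf-8") -> bytes:
    """Convert filename bytes into the editor's internal encoding.

    Hypothetical model of the missing step: without it, the raw byte is
    interpreted directly in the internal encoding.
    """
    return data.decode(filename_encoding).encode(internal_encoding)


print(decodes_as(raw_name, "latin-1"))  # True: 0xA9 is the copyright mark
print(decodes_as(raw_name, "utf-8"))    # False: a lone 0xA9 is not valid utf-8
print(filename_for_display(raw_name, "latin-1"))  # b'\xc2\xa9'
```

With 'encoding=utf-8' and no conversion, the lone byte is not a valid
character, so it can only be shown as <a9>.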

This happens to ALL CJK users who use a non-unicode filesystem encoding
with 'encoding=utf-8'.

Benjamin wrote:
> I think 'filenameencoding' (though long) would be a better name.
> 'systemencoding' sounds like what the current 'encoding' option does.
> ("system" to me implies the computing environment, not the filesystem.)

I named the proposed option 'systemencoding' because
this translation is also needed when executing a shell command.

For example (on Vim with 'encoding=utf-8' and 'termencoding=latin1'):
 :w [latin1_filename]  # This should be latin1 filename on filesystem too.
 :e [latin1_filename]  # ditto
 :!echo 'some_message' > [latin1_filename]  # ditto
 :r! cat -n [latin1_filename]  # ditto

# The above 'some_message' may be latin1 too ;)

If there is no encoding translation support for shell execution,
:w and :e work fine with latin1, but :!echo and :r! do not.

This confuses the user, especially when using completion
on the ex command line.

Furthermore, the user was using a shell with latin1 before starting Vim.
Executing "echo 'some_message' > [latin1_filename]" in that shell
creates a file with a latin1 filename and a latin1 message in it.

But if there is no encoding translation support for shell execution,
the user gets different results on the Vim ex command line.

So, with this translation, let's make sure the user gets the same results
when executing the same command on the Vim ex command line.

As a side effect of this translation, 'some_message' is treated as latin1 too.
But this is very natural, isn't it?
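
The intended effect on a shell command can be modelled with a short Python
sketch. This is only an illustration under stated assumptions, not the C code
in the patch: the function name to_system is hypothetical, an empty
systemencoding is taken to mean "no translation", and a character that cannot
be represented in the target encoding simply raises an error here.

```python
def to_system(internal: bytes, encoding: str = "utf-8",
              systemencoding: str = "") -> bytes:
    """Translate a command line from the internal encoding to the
    encoding used by the shell and the filesystem.

    Hypothetical model: when `systemencoding` is empty the bytes are
    passed through unchanged.
    """
    if not systemencoding:
        return internal
    return internal.decode(encoding).encode(systemencoding)


# The command as held internally with 'encoding=utf-8';
# the file name is a copyright mark (U+00A9).
cmd = "echo 'some_message' > \u00a9".encode("utf-8")

print(to_system(cmd))                            # unchanged, name is b'\xc2\xa9'
print(to_system(cmd, systemencoding="latin-1"))  # name becomes b'\xa9'
```

Because the whole command line is converted, the message text is converted
together with the file name.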

Yue wrote:
> I strongly support this feature, cjk people needs it, very much.

Thanks!

P.S.
The attached patch adds the thin translation layer described above.
The translation is done every time Vim interacts with the filesystem
or executes a shell command.

# I'm sorry, the previous version of the patch had a double-free bug...

Bram, could you include the patch for beta testing?
This translation won't be enabled unless the 'systemencoding' option is set.
# Currently, this may work only on unix-like systems.

Thanks in advance.

Best Regards,
Kikuchan


vim-systemencoding.patch
