Hi Bram,

On 2/27/07, Bram Moolenaar <[EMAIL PROTECTED]> wrote:
> Yongwei Wu wrote:
>
> > On 2/27/07, Bram Moolenaar <[EMAIL PROTECTED]> wrote:
> > >
> > > If I understand it correctly is GB18030 a multi-byte character set that
> > > is mostly the same as cp936, but adds a number of 4-byte characters.
> > > Vim does not support those 4-byte characters, thus setting 'encoding' to
> > > gb18030 won't work.
> > >
> > > But conversion between gb18030 and utf-8 should work, thus when
> > > 'encoding' is utf-8 it should be possible to use gb18030 in
> > > 'fileencodings' and 'fileencoding'. Perhaps you can check if that
> > > works.
> >
> > No, with Patch 58 Vim regards gb18030 as an alias for cp936, and
> > gb18030 does not work at all: this is the major problem.
>
> Please be specific: What do you mean with "does not work at all"?

As I said, Vim regards gb18030 as cp936 after Patch 58, i.e., ":e ++enc=gb18030" is now equivalent to ":e ++enc=cp936". As a result, a file encoded in GB18030 cannot be opened correctly in Vim: the 4-byte encoded characters come out wrong.

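To make the failure concrete, here is a minimal sketch of what happens when 4-byte GB18030 data is read as cp936. It uses Python's built-in codecs purely as a stand-in; it is not Vim's conversion code, and the names `SAMPLE` and `decodes_as` are made up for this illustration.

```python
# Illustration only: Python's codecs stand in for whatever converter Vim calls.
SAMPLE = "\U00020000"  # a CJK Extension B character, outside the BMP


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` is a valid byte sequence in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


raw = SAMPLE.encode("gb18030")
print(len(raw))                    # how many bytes GB18030 needs for SAMPLE
print(decodes_as(raw, "gb18030"))  # read back with the right codec
print(decodes_as(raw, "cp936"))    # read back as if gb18030 were cp936
```

The second byte of a 4-byte GB18030 sequence is a digit (0x30 to 0x39), which is not a valid trail byte in cp936, so a cp936 decoder cannot interpret the sequence as the intended character.
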
> Since Vim doesn't support gb18030 internally and only Unicode has all the
> characters, I guess it would only work to edit these files when
> 'encoding' is "utf-8".

That depends on the purpose. If one sets GB18030 only because it is the default value, setting 'encoding' to cp936 works most of the time. In some edge cases, if only a subset of GB18030 characters is used, other encodings may work well too. However, to support GB18030 properly, I believe UTF-8 is the right choice for 'encoding'. I have been using encoding=utf-8 for a long time now.

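The "works most of the time" case can be checked the same way. The sketch below, again using Python's codecs only as an illustration (the helper name `survives_as_cp936` is hypothetical), tests whether a piece of text has identical bytes in cp936 and gb18030, which is the situation where treating one as the other does no harm.

```python
def survives_as_cp936(text: str) -> bool:
    """True if `text` encodes to the same bytes in cp936 and gb18030."""
    try:
        return text.encode("cp936") == text.encode("gb18030")
    except UnicodeEncodeError:
        # The text contains a character that cp936 cannot represent at all.
        return False


print(survives_as_cp936("\u4e2d\u6587"))  # two common Chinese characters
print(survives_as_cp936("\U00020000"))    # a character needing 4 bytes in GB18030
```
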
> However, if gb18030 is used in the environment that means that console
> output needs to be converted, thus 'termencoding' also needs to be set.

Not if encoding==gbk. According to the discussion, Edward wants to alias GB18030 to GBK in the environment. In that case 'encoding' will be GBK by default and every character output will be within the GBK range, so no conversion is needed. If 'encoding' is (manually set to) utf-8 while the environment is GB18030, the hack Edward uses has no effect at all (its purpose, I believe, is to make sure Vim gets 'CP936' rather than 'Latin1' as the default encoding, so that Chinese can be processed correctly), and one would need to set 'tenc' to gb18030 manually anyhow.

If I had to choose between not supporting GB18030 text properly and not supporting the locale zh_CN.GB18030 properly, I would choose the latter.

Yet another "solution" is to move the gb18030 line down to the UNIX-specific part (say, l.392 in mbyte.c). That would be better than what we have now, but it is still a hack and can surprise people.

Best regards,

Yongwei

-- 
Wu Yongwei
URL: http://wyw.dcweb.cn/
