Jürgen Krämer wrote:

> with 'encoding' set to "utf-8" there is a quite confusing (to me)
> difference between the column number and my expectations (supported by
> the virtual column number) if there are non-ASCII characters on the
> line. I don't know what the intended meaning of "column count" and the
> intended behaviour of "cursor()" are, but it seems they both depend on
> the size of the encoded characters. I always thought "nth column" was
> more or less a synonym for "nth character on a line" while "nth virtual
> column" meant "nth cell on a screen line".
>
> Here is how to reproduce the observed behaviour. Start
>
>   vim -u NONE -U NONE
>
> and
>
>   :set encoding=utf-8
>   :set laststatus=2
>   :set statusline=[%c/%v]
>
> (The last line tells VIM to display the column and the virtual column.)
> Now enter two lines
>
>   abc
>   äbc
>
> (The first letter in the second line is a lower case "A" with umlaut.)
> While moving the cursor over the different characters on the first line
> the status line shows "[1/1]", "[2/2]", and "[3/3]", respectively,
> telling you that "column" and "virtual column" are equal. That is the
> expected behaviour as long as there are no special characters like tabs
> and non-printable characters.
>
> Now move the cursor over the characters in the second line. While the
> cursor is over the "ä" "[1/1]" is displayed, but the next characters
> result in "[3/2]" and "[4/3]", respectively. It seems as if "ä" (or any
> non-ASCII character, for that matter) is accounting for (at least) two
> columns while encoding is set to "utf-8". Although I know that "ä" is
> represented by two bytes in UTF-8 encoding, I find this behaviour
> irritating because on the surface it's only one character. It even gets
> worse (IMHO) with characters that need three bytes in UTF-8 encoding,
> like LATIN CAPITAL LETTER A WITH DOT BELOW (0x1EA0), which increase the
> column number by three.
>
> Also the "cursor()" function shows this kind of interpretation of
> non-ASCII characters. Both
>
>   call cursor(2, 1)
>
> and
>
>   call cursor(2, 2)
>
> place the cursor on "ä". To place it on "b" you need to
>
>   call cursor(2, 3)
>
> although I would expect that already the second example would place the
> cursor on "b".
>
> I can think of two ways to circumvent this problem:
>
> 1) switching to "encoding=latin1", which is not always an option
>    because of the need for characters outside the scope of latin1;
>
> 2) using only virtual column numbers in the status line, but this
>    gives different results when characters like tab or non-printables
>    are displayed in more than one screen cell (which is of course
>    reasonable).
>
> I don't know whether the shown behaviour is a bug or just a feature I
> don't like, but in summary I think "column number" should really
> represent a character count (i.e, corresponding to what the user sees),
> not a byte count depending on the underlying encoding.
>
> I have seen this behaviour in VIM 6.2, 6.3, 6.4, and 7.0, so changing
> the code will definitely introduce an incompatibility. So the final
> question is: What do you (Vimmers) and you (Bram) think: is there a need
> for a change.
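
To make the numbers in the report concrete, here is a small Python sketch of the arithmetic involved. It is only an illustration, not Vim's code: the helper names byte_column and char_index_at_byte_column are made up for this example, and the sketch assumes the line is stored as UTF-8 and that "ä" is the single precomposed code point U+00E4.

```python
# Illustration only (hypothetical helpers, not Vim's implementation):
# models a 1-based byte column like the one shown by %c, and the way a
# byte column passed to cursor() maps back to a character.

def byte_column(line: str, char_index: int) -> int:
    """1-based byte offset of the character at 0-based char_index."""
    return len(line[:char_index].encode("utf-8")) + 1


def char_index_at_byte_column(line: str, byte_col: int) -> int:
    """0-based index of the character that contains the 1-based byte column."""
    offset = 0
    for i, ch in enumerate(line):
        width = len(ch.encode("utf-8"))  # bytes used by this character
        if byte_col <= offset + width:
            return i
        offset += width
    # Past the end of the line: clamp to the last character (assumption).
    return max(len(line) - 1, 0)


if __name__ == "__main__":
    line = "\u00e4bc"  # "äbc" with a precomposed a-umlaut
    for i, ch in enumerate(line):
        # Byte column next to the plain character column (i + 1).
        print(ch, byte_column(line, i), i + 1)
    # cursor(2, 1) and cursor(2, 2) both land on "ä", cursor(2, 3) on "b".
    print([line[char_index_at_byte_column(line, c)] for c in (1, 2, 3)])
```

For "äbc" the byte columns come out as 1, 3, 4, which matches the first number in "[1/1]", "[3/2]" and "[4/3]". With U+1EA0 as the first character they would be 1, 4, 5.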
I don't know why you call this a column count; in most places it's called
a byte count. Perhaps the remark that this is actually a byte count is
missing in some places in the docs.

You could also want a character count. But what is a character when
composing characters are used? E.g., when the umlaut is not included in
the character but added as a separate composing character? It's not so
obvious what to do. In these situations I'd rather keep it as it is.

-- 
Bram Moolenaar -- http://www.Moolenaar.net
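
The composing-character question can be shown with a short Python sketch. This is an illustration under stated assumptions, not a proposal for how Vim should count: the function counts is hypothetical, and treating every code point with a non-zero combining class as "part of the previous character" is only a rough stand-in for what a user perceives as one character.

```python
# Illustration only: the same visible text "äbc" yields different counts
# depending on whether the umlaut is precomposed or a composing character.
import unicodedata

precomposed = "\u00e4bc"   # "ä" as the single code point U+00E4
decomposed = "a\u0308bc"   # "a" followed by U+0308 COMBINING DIAERESIS


def counts(line: str) -> tuple[int, int, int]:
    """Return (UTF-8 bytes, code points, non-combining code points)."""
    n_bytes = len(line.encode("utf-8"))
    n_code_points = len(line)
    # Assumption: a code point with combining class 0 starts a new character.
    n_base = sum(1 for ch in line if unicodedata.combining(ch) == 0)
    return n_bytes, n_code_points, n_base


if __name__ == "__main__":
    print(counts(precomposed))  # (4, 3, 3)
    print(counts(decomposed))   # (5, 4, 3)
```

Both strings look the same on screen, yet only the third count agrees between them, so a "character count" has to say which of these notions it means.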