Patch 7.2.078
Patch 7.2.078
Problem:    When deleting a fold that is specified with markers the cursor
            position may be wrong.  Folds may not be displayed properly after
            a delete.  Wrong fold may be deleted.
Solution:   Fix the problems. (mostly by Lech Lorens)
Files:      src/fold.c


*** ../vim-7.2.077/src/fold.c   Fri Nov 28 21:26:50 2008
--- src/fold.c  Tue Jan  6 14:53:26 2009
***************
*** 740,746 ****
      garray_T    *found_ga;
      fold_T      *found_fp = NULL;
      linenr_T    found_off = 0;
!     int         use_level = FALSE;
      int         maybe_small = FALSE;
      int         level = 0;
      linenr_T    lnum = start;
--- 740,746 ----
      garray_T    *found_ga;
      fold_T      *found_fp = NULL;
      linenr_T    found_off = 0;
!     int         use_level;
      int         maybe_small = FALSE;
      int         level = 0;
      linenr_T    lnum = start;
***************
*** 757,762 ****
--- 757,763 ----
          gap = &curwin->w_folds;
          found_ga = NULL;
          lnum_off = 0;
+         use_level = FALSE;
          for (;;)
          {
              if (!foldFind(gap, lnum - lnum_off, &fp))
***************
*** 783,802 ****
          else
          {
              lnum = found_fp->fd_top + found_fp->fd_len + found_off;
-             did_one = TRUE;

              if (foldmethodIsManual(curwin))
                  deleteFoldEntry(found_ga,
                      (int)(found_fp - (fold_T *)found_ga->ga_data), recursive);
              else
              {
!                 if (found_fp->fd_top + found_off < first_lnum)
!                     first_lnum = found_fp->fd_top;
!                 if (lnum > last_lnum)
                      last_lnum = lnum;
!                 parseMarker(curwin);
                  deleteFoldMarkers(found_fp, recursive, found_off);
              }

              /* redraw window */
              changed_window_setting();
--- 784,804 ----
          else
          {
              lnum = found_fp->fd_top + found_fp->fd_len + found_off;

              if (foldmethodIsManual(curwin))
                  deleteFoldEntry(found_ga,
                      (int)(found_fp - (fold_T *)found_ga->ga_data), recursive);
              else
              {
!                 if (first_lnum > found_fp->fd_top + found_off)
!                     first_lnum = found_fp->fd_top + found_off;
!                 if (last_lnum < lnum)
                      last_lnum = lnum;
!                 if (!did_one)
!                     parseMarker(curwin);
                  deleteFoldMarkers(found_fp, recursive, found_off);
              }
+             did_one = TRUE;

              /* redraw window */
              changed_window_setting();
***************
*** 811,816 ****
--- 813,822 ----
          redraw_curbuf_later(INVERTED);
  #endif
      }
+     else
+         /* Deleting markers may make cursor column invalid. */
+         check_cursor_col();
+
      if (last_lnum > 0)
          changed_lines(first_lnum, (colnr_T)0, last_lnum, 0L);
  }
*** ../vim-7.2.077/src/version.c        Wed Dec 31 16:20:54 2008
--- src/version.c       Tue Jan  6 15:00:36 2009
***************
*** 678,679 ****
--- 678,681 ----
  {   /* Add new patch number below this line */
+ /**/
+     78,
  /**/

-- 
Looking at Perl through Lisp glasses, Perl looks atrocious.

 /// Bram Moolenaar -- b...@moolenaar.net -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\        download, build and distribute -- http://www.A-A-P.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///

--~--~-~--~~~---~--~~
You received this message from the vim_dev maillist.
For more information, visit http://www.vim.org/maillist.php
-~--~~~~--~~--~--~---
Patch 7.2.079
Patch 7.2.079
Problem:    "killed" netbeans events are not handled correctly.
Solution:   A "killed" netbeans event is sent when the buffer is deleted or
            wiped out (in this case, the netbeans annotations in this buffer
            have been removed).  A user can still remove a sign with the
            command ":sign unplace" and this does not trigger a "killed"
            event.  (Xavier de Gaye)
Files:      runtime/doc/netbeans.txt, src/buffer.c, src/globals.h,
            src/netbeans.c, src/proto/netbeans.pro


*** ../vim-7.2.078/runtime/doc/netbeans.txt     Sat Aug  9 19:36:49 2008
--- runtime/doc/netbeans.txt    Tue Jan  6 15:23:39 2009
***************
*** 1,4 ****
! *netbeans.txt*  For Vim version 7.2.  Last change: 2008 Jun 28


                  VIM REFERENCE MANUAL    by Gordon Prieur et al.
--- 1,4 ----
! *netbeans.txt*  For Vim version 7.2.  Last change: 2009 Jan 06


                  VIM REFERENCE MANUAL    by Gordon Prieur et al.
***************
*** 722,729 ****
                  of the cursor.
                  New in version 2.1.

! killed          A file was closed by the user.  Only for files that have been
!                 assigned a number by the IDE.

  newDotAndMark off off
                  Reports the position of the cursor being at "off" bytes into
--- 722,731 ----
                  of the cursor.
                  New in version 2.1.

! killed          A file was deleted or wiped out by the user and the buffer
!                 annotations have been removed.  The bufID number for this
!                 buffer has become invalid.  Only for files that have been
!                 assigned a bufID number by the IDE.

  newDotAndMark off off
                  Reports the position of the cursor being at "off" bytes into
*** ../vim-7.2.078/src/buffer.c Wed Dec  3 11:21:20 2008
--- src/buffer.c        Tue Jan  6 15:23:02 2009
***************
*** 437,446 ****
          return;
  #endif

- #ifdef FEAT_NETBEANS_INTG
-     if (usingNetbeans)
-         netbeans_file_closed(buf);
- #endif
      /* Change directories when the 'acd' option is set. */
      DO_AUTOCHDIR

--- 437,442 ----
***************
*** 639,644 ****
--- 635,644 ----
  #ifdef FEAT_SIGNS
      buf_delete_signs(buf);            /* delete any signs */
  #endif
+ #ifdef FEAT_NETBEANS_INTG
+     if (usingNetbeans)
+         netbeans_file_killed(buf);
+ #endif
  #ifdef FEAT_LOCALMAP
      map_clear_int(buf, MAP_ALL_MODES, TRUE, FALSE);  /* clear local mappings */
      map_clear_int(buf, MAP_ALL_MODES, TRUE, TRUE);   /* clear local abbrevs */
***************
*** 815,823 ****
      int         bnr;                    /* buffer number */
      char_u      *p;

- #ifdef FEAT_NETBEANS_INTG
-     netbeansCloseFile = 1;
- #endif
      if (addr_count == 0)
      {
          (void)do_buffer(command, DOBUF_CURRENT, FORWARD, 0, forceit);
--- 815,820 ----
***************
*** 912,920 ****
          }
      }

- #ifdef FEAT_NETBEANS_INTG
-     netbeansCloseFile = 0;
- #endif

      return errormsg;
  }
--- 909,914 ----
*** ../vim-7.2.078/src/globals.h        Fri Nov 28 21:26:50 2008
--- src/globals.h       Tue Jan  6 15:23:02 2009
***************
*** 1340,1346 ****

  #ifdef FEAT_NETBEANS_INTG
  EXTERN char *netbeansArg INIT(= NULL);  /* the -nb[:host:port:passwd] arg */
- EXTERN int netbeansCloseFile INIT(= 0); /* send killed if != 0 */
  EXTERN int netbeansFireChanges INIT(= 1); /* send buffer changes if != 0 */
  EXTERN int netbeansForcedQuit INIT(= 0);/* don't write modified files */
  EXTERN int netbeansReadFile INIT(= 1);  /* OK to read from disk if != 0 */
--- 1340,1345 ----
*** ../vim-7.2.078/src/netbeans.c       Wed Dec 24 12:20:10 2008
--- src/netbeans.c      Tue Jan  6 15:23:02 2009
***************
*** 2921,2964 ****
  }

  /*
!  * Tell netbeans a file was closed.
   */
      void
! netbeans_file_closed(buf_T *bufp)
  {
      int         bufno = nb_getbufno(bufp);
      nbbuf_T     *nbbuf = nb_get_buf(bufno);
      char        buffer[2*MAXPATHL];

!     if (!haveConnection || bufno < 0)
          return;

!     if (!netbeansCloseFile)
!     {
!         nbdebug(("Ignoring file_closed for %s. File was closed from IDE\n",
!                     bufp->b_ffname));
!         return;
!     }
!
!     nbdebug(("netbeans_file_closed:\n"));
!     nbdebug(("Closing bufno: %d", bufno));
!     if (curbuf != NULL && curbuf != bufp)
!     {
!         nbdebug(("Curbuf bufno: %d\n", nb_getbufno(curbuf)));
!     }
!     else if (curbuf == bufp)
!     {
!         nbdebug(("curbuf == bufp\n"));
!     }
!
!     if (bufno <= 0)
!         return;

      sprintf(buffer, "%d:killed=%d\n", bufno, r_cmdno);

      nbdebug(("EVT: %s", buffer));

!     nb_send(buffer, "netbeans_file_closed");

      if (nbbuf != NULL)
          nbbuf->bufp = NULL;
--- 2921,2946 ----
  }

  /*
!  * Tell netbeans that a file was deleted or wiped out.
   */
      void
! netbeans_file_killed(buf_T *bufp)
  {
      int         bufno =
Re: [patch] bug in completion with i_CTRL-N using arabic or persian keymap
Dominique Pelle wrote:

> Hi.
>
> I can reproduce the following error with valgrind memory checker
> using Vim-7.2.75 (huge) on Linux x86 with utf-8 locale:
>
> ==15276== Invalid read of size 1
> ==15276==    at 0x4026438: strlen (mc_replace_strmem.c:242)
> ==15276==    by 0x8107E39: ins_bytes (misc1.c:1860)
> ==15276==    by 0x8067EC0: ins_compl_new_leader (edit.c:3212)
> ==15276==    by 0x8068048: ins_compl_addleader (edit.c:3297)
> ==15276==    by 0x80641AA: edit (edit.c:765)
> ==15276==    by 0x812F248: invoke_edit (normal.c:8901)
> ==15276==    by 0x812F1EE: nv_edit (normal.c:8874)
> ==15276==    by 0x8122A3C: normal_cmd (normal.c:1200)
> ==15276==    by 0x80E5C9D: main_loop (main.c:1180)
> ==15276==    by 0x80E57EA: main (main.c:939)
> ==15276==  Address 0x4e671af is 1 bytes before a block of size 3 alloc'd
> ==15276==    at 0x4025D2E: malloc (vg_replace_malloc.c:207)
> ==15276==    by 0x811303C: lalloc (misc2.c:859)
> ==15276==    by 0x8112F58: alloc (misc2.c:758)
> ==15276==    by 0x81133EF: vim_strnsave (misc2.c:1176)
> ==15276==    by 0x8068035: ins_compl_addleader (edit.c:3294)
> ==15276==    by 0x80641AA: edit (edit.c:765)
> ==15276==    by 0x812F248: invoke_edit (normal.c:8901)
> ==15276==    by 0x812F1EE: nv_edit (normal.c:8874)
> ==15276==    by 0x8122A3C: normal_cmd (normal.c:1200)
> ==15276==    by 0x80E5C9D: main_loop (main.c:1180)
> ==15276==    by 0x80E57EA: main (main.c:939)
>
> Steps to reproduce:
>
> 1/ Create a sample tag file (using Vim source files for example):
>
>    $ cd vim7/src
>    $ ctags *.c *.h
>
> 2/ Create a minimalistic vimrc file enough to trigger bug:
>
>    set completeopt=menuone,longest
>    set tags=tags
>    set keymap=arabic
>
>    I tried several keymaps (not all of them), but I can somehow
>    only reproduce this bug using 'set keymap=arabic' or
>    'set keymap=persian'.
>
> 3/ Start Vim with valgrind:
>
>    $ valgrind vim -u test.vimrc 2> valgrind.log
>
> 4/ Type the following commands in Normal mode
>    (completion using pum tags):
>
>    i-<ctrl-n>X
>
> 5/ Observe the above valgrind error in valgrind.log right after
>    pressing X in step 4/
>
> edit.c, around line 3212:
>
>   3207     static void
>   3208 ins_compl_new_leader()
>   3209 {
>   3210     ins_compl_del_pum();
>   3211     ins_compl_delete();
> ! 3212     ins_bytes(compl_leader + curwin->w_cursor.col - compl_col);
>   3213     compl_used_match = FALSE;
>
> When bug happens, I see that curwin->w_cursor.col is 0, and
> compl_col is 1, so argument of ins_bytes() at line 3212 is
> 1 byte before beginning of string compl_leader (hence the error).
>
> Without keymap, or with other keymaps than arabic or persian,
> curwin->w_cursor.col is 1 and compl_col is also 1 (so bug
> then does not happen).
>
> I'm not sure what's the right way to fix it: obviously we can do
> a check for curwin->w_cursor.col being greater or equal than
> compl_col as in attached patch. Although it pacifies valgrind,
> it may only work around the bug. I was testing Vim with keymaps
> and I don't know how arabic and persian keymaps are supposed to
> behave to tell whether the behavior is correct (but the valgrind
> error is clearly not expected).

It turns out that the X inserted is keymapped to a composing character.
When deleting the completed text this causes the character before it,
the "-", also to be deleted.

The patch below avoids deleting too much.  And also avoids the offset
going negative, in case there is another situation where this happens.

Let me know if there are any remaining (or new) problems after this
patch.


*** ../vim-7.2.079/src/edit.c   Wed Aug  6 18:56:55 2008
--- src/edit.c  Tue Jan  6 18:55:16 2009
***************
*** 147,152 ****
--- 147,153 ----
  static int  ins_compl_bs __ARGS((void));
  static void ins_compl_new_leader __ARGS((void));
  static void ins_compl_addleader __ARGS((int c));
+ static int  ins_compl_len __ARGS((void));
  static void ins_compl_restart __ARGS((void));
  static void ins_compl_set_original_text __ARGS((char_u *str));
  static void ins_compl_addfrommatch __ARGS((void));
***************
*** 1933,1938 ****
--- 1934,1941 ----
  /*
   * Backspace the cursor until the given column.  Handles REPLACE and VREPLACE
   * modes correctly.  May also be used when not in insert mode at all.
+  * Will attempt not to go before "col" even when there is a composing
+  * character.
   */
      void
  backspace_until_column(col)
***************
*** 1944,1950 ****
          if (State & REPLACE_FLAG)
              replace_do_bs();
          else
!             (void)del_char(FALSE);
      }
  }
  #endif
--- 1947,1978 ----
          if (State & REPLACE_FLAG)
              replace_do_bs();
          else
!         {
! #ifdef FEAT_MBYTE
!             if (enc_utf8)
!             {
!                 int ecol = curwin->w_cursor.col + 1;
!
!                 /* Make sure the cursor is at the start of a character, but
!                  * skip forward again when going too far back because of a
!                  * composing character. */
!                 mb_adjust_cursor();
!
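
The two effects described in this message (a composing character attaching to the character before it, and an offset that can drop below zero) can be sketched outside Vim. This is a minimal illustration in Python, not the code from the patch: `split_clusters` and `leader_tail` are hypothetical names, and U+0651 (ARABIC SHADDA) is only an assumed stand-in, since the message does not say which composing character X is mapped to.

```python
import unicodedata

def split_clusters(text: str) -> list:
    """Group each base character with the composing characters after it."""
    clusters = []
    for ch in text:
        # unicodedata.combining() is non-zero for composing characters.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

def leader_tail(compl_leader: str, cursor_col: int, compl_col: int) -> str:
    """Return the leader from the cursor onward, clamping the offset at 0.

    Mirrors the idea of guarding "compl_leader + cursor_col - compl_col"
    so that it never points before the start of the string.
    """
    offset = max(cursor_col - compl_col, 0)
    return compl_leader[offset:]

typed = "-" + "\u0651"             # "-" followed by an assumed composing char
print(len(split_clusters(typed)))  # the two code points form one cluster
print(leader_tail("ab", 0, 1))     # cursor_col < compl_col: offset clamped
```

Deleting "one character" at the cluster level removes the "-" together with the composing character, which is why the cursor column can end up smaller than compl_col.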
Is vim really fully unicoded?
Hi, list,

As the title says: if so, why do many functions still not handle Unicode correctly? For example, the expression:

    getline('.')[col('.')-1]

can't return a character outside the range of ASCII.

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Re: Is vim really fully unicoded?
On 06/01/09 12:31, anhnmncb wrote:
> Hi, list,
>
> As the title says: if so, why do many functions still not handle Unicode correctly? For example, the expression:
>
>     getline('.')[col('.')-1]
>
> can't return a character outside the range of ASCII.

because string[index] returns a byte value, not a character value: see :help expr8. If the character at the cursor is > U+007F, you'll get the first byte (in the range 0xC0-0xFD, or in practice in the range 0xC0-0xF4) of its UTF-8 representation.

The _character_ at the cursor is obtained as follows:

    let i0 = byteidx(getline('.'), virtcol('.') - 1)
    let i1 = byteidx(getline('.'), virtcol('.'))
    let character = strpart(getline('.'), i0, i1 - 10)

Best regards,
Tony.
-- 
Q: How many hardware engineers does it take to change a lightbulb?
A: None.  We'll fix it in software.
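
The byte-versus-character distinction in this reply can be seen in a small sketch. It is written in Python rather than Vim script, with a `bytes` object standing in for Vim's byte-indexed string; `is_lead_byte` is a hypothetical helper that simply tests the 0xC0-0xF4 range quoted above.

```python
# Sketch only: a Python bytes object stands in for a byte-indexed Vim string.
line = "a\u00e9\u6d4b"          # 1-byte, 2-byte and 3-byte UTF-8 characters
raw = line.encode("utf-8")      # six bytes in total

def is_lead_byte(value: int) -> bool:
    """True if value lies in the 0xC0-0xF4 range quoted for lead bytes."""
    return 0xC0 <= value <= 0xF4

# Indexing the bytes yields single byte values, never whole characters.
for index in range(len(raw)):
    kind = "lead byte" if is_lead_byte(raw[index]) else "other"
    print(index, hex(raw[index]), kind)

# Indexing the decoded string yields whole characters instead.
print(line[1], line[2])
```

Byte index 1 holds only the first byte of the two-byte character, which is the behaviour the reply attributes to string[index].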
Re: Is vim really fully unicoded?
On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
> On 06/01/09 12:31, anhnmncb wrote:
>> Hi, list,
>>
>> As the title says: if so, why do many functions still not handle Unicode correctly? For example, the expression:
>>
>>     getline('.')[col('.')-1]
>>
>> can't return a character outside the range of ASCII.
>
> because string[index] returns a byte value, not a character value: see :help expr8.

*Nod*

> If the character at the cursor is > U+007F, you'll get the first byte (in the range 0xC0-0xFD, or in practice in the range 0xC0-0xF4) of its UTF-8 representation.

No, you could get some byte of some entirely different character. I.e., on a line with two 2-byte characters, getline('.')[col('.')-1] on the second character would return the 2nd byte of the first character.

> The _character_ at the cursor is obtained as follows:
>
>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>     let i1 = byteidx(getline('.'), virtcol('.'))
>     let character = strpart(getline('.'), i0, i1 - 10)

Using virtcol() there seems broken... what if you're in the middle of a tab, for example, with virtualedit=all?

    :echo join(split("áéíóú", '\zs')[1:3], '')

is how I would do it... but, is there any real reason why indexing into a string *should* be byte oriented instead of character oriented, apart from backwards compatibility? It seems drastically less easy to use the thing that more people want to use more of the time; and in fact some of the snippets in the vim help (like the example given at :help expr-8) won't work on multibyte lines given the way that string indexing works now. It seems like a place where the cost of losing backwards compatibility might be outweighed by the cost of keeping things the way they are...

~Matt
Re: Is vim really fully unicoded?
On 07/01/09 00:39, Matt Wozniski wrote:
> On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
>> On 06/01/09 12:31, anhnmncb wrote:
>>> Hi, list,
>>>
>>> As the title says: if so, why do many functions still not handle Unicode correctly? For example, the expression:
>>>
>>>     getline('.')[col('.')-1]
>>>
>>> can't return a character outside the range of ASCII.
>>
>> because string[index] returns a byte value, not a character value: see :help expr8.
>
> *Nod*
>
>> If the character at the cursor is > U+007F, you'll get the first byte (in the range 0xC0-0xFD, or in practice in the range 0xC0-0xF4) of its UTF-8 representation.
>
> No, you could get some byte of some entirely different character. I.e., on a line with two 2-byte characters, getline('.')[col('.')-1] on the second character would return the 2nd byte of the first character.

col() gives a one-based byte ordinal. [] takes a zero-based argument. I stand by what I said.

>> The _character_ at the cursor is obtained as follows:
>>
>>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>     let i1 = byteidx(getline('.'), virtcol('.'))
>>     let character = strpart(getline('.'), i0, i1 - 10)
>
> Using virtcol() there seems broken... what if you're in the middle of a tab, for example, with virtualedit=all?
>
>     :echo join(split("áéíóú", '\zs')[1:3], '')

OK, I didn't think of virtual editing, nor even, it seems, of multi-column characters such as tabs and fullwidth CJK. However, [1:3] wouldn't work because the idea is that we're in a script, we don't know that we're in the 1st, 2nd or 3rd column, just that we want whatever is at the cursor. I might do it with

    function CursorChar()
        normal yl
        return @@
    endfunction

> is how I would do it... but, is there any real reason why indexing into a string *should* be byte oriented instead of character oriented, apart from backwards compatibility? It seems drastically less easy to use the thing that more people want to use more of the time; and in fact some of the snippets in the vim help (like the example given at :help expr-8) won't work on multibyte lines given the way that string indexing works now. It seems like a place where the cost of losing backwards compatibility might be outweighed by the cost of keeping things the way they are...
>
> ~Matt

Changing an existing construct from byte-oriented to multibyte-character-oriented would probably break a lot of existing scripts. I don't believe Bram would ever accept that.

Best regards,
Tony.
-- 
A programmer is a person who passes as an exacting expert on the basis of being able to turn out, after innumerable punching, an infinite series of incomprehensive answers calculated with micrometric precisions from vague assumptions based on debatable figures taken from inconclusive documents and carried out on instruments of problematical accuracy by persons of dubious reliability and questionable mentality for the avowed purpose of annoying and confounding a hopelessly defenseless department that was unfortunate enough to ask for the information in the first place.
                -- IEEE Grid news magazine
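
The point that col() is a one-based byte ordinal while [] is zero-based can be worked through on a concrete line. The sketch below is Python, not Vim script; `char_at_byte_col` is a hypothetical helper, and it assumes the column always points at the first byte of a character, as a cursor position would.

```python
def char_at_byte_col(line: str, col: int) -> str:
    """Return the character that starts at one-based byte column col."""
    raw = line.encode("utf-8")
    start = col - 1                 # one-based column -> zero-based index
    end = start + 1
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx; skip them
    # to find where the character that starts at "start" ends.
    while end < len(raw) and (raw[end] & 0xC0) == 0x80:
        end += 1
    return raw[start:end].decode("utf-8")

line = "\u6d4b\u8bd5(test)"         # two 3-byte characters, then ASCII
col = 4                             # byte column of the second character
print(hex(line.encode("utf-8")[col - 1]))   # the single byte [col - 1] sees
print(char_at_byte_col(line, col))          # the whole character there
```

With the cursor on the second character, index col - 1 lands on that character's own first byte, not on a byte of the previous character.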
Re: Is vim really fully unicoded?
On 1/6/09, Tony Mechelynck wrote:
> On 07/01/09 00:39, Matt Wozniski wrote:
>> On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
>>> On 06/01/09 12:31, anhnmncb wrote:
>>>> Hi, list,
>>>>
>>>> As the title says: if so, why do many functions still not handle Unicode correctly? For example, the expression:
>>>>
>>>>     getline('.')[col('.')-1]
>>>>
>>>> can't return a character outside the range of ASCII.
>>>
>>> because string[index] returns a byte value, not a character value: see :help expr8.
>>
>> *Nod*
>>
>>> If the character at the cursor is > U+007F, you'll get the first byte (in the range 0xC0-0xFD, or in practice in the range 0xC0-0xF4) of its UTF-8 representation.
>>
>> No, you could get some byte of some entirely different character. I.e., on a line with two 2-byte characters, getline('.')[col('.')-1] on the second character would return the 2nd byte of the first character.
>
> col() gives a one-based byte ordinal. [] takes a zero-based argument. I stand by what I said.

Ooh, you're right - I forgot col() returned a byte index, and not the column as its name would imply...

>>> The _character_ at the cursor is obtained as follows:
>>>
>>>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>>     let i1 = byteidx(getline('.'), virtcol('.'))
>>>     let character = strpart(getline('.'), i0, i1 - 10)
>>
>> Using virtcol() there seems broken... what if you're in the middle of a tab, for example, with virtualedit=all?
>>
>>     :echo join(split("áéíóú", '\zs')[1:3], '')
>
> OK, I didn't think of virtual editing, nor even, it seems, of multi-column characters such as tabs and fullwidth CJK. However, [1:3] wouldn't work because the idea is that we're in a script, we don't know that we're in the 1st, 2nd or 3rd column, just that we want whatever is at the cursor. I might do it with
>
>     function CursorChar()
>         normal yl
>         return @@
>     endfunction

    echo matchstr(getline('.'), '\%' . col('.') . 'c.')

does the same thing without clobbering the unnamed register... slightly more elegant, imho.

>> is how I would do it... but, is there any real reason why indexing into a string *should* be byte oriented instead of character oriented, apart from backwards compatibility? It seems drastically less easy to use the thing that more people want to use more of the time; and in fact some of the snippets in the vim help (like the example given at :help expr-8) won't work on multibyte lines given the way that string indexing works now. It seems like a place where the cost of losing backwards compatibility might be outweighed by the cost of keeping things the way they are...
>
> Changing an existing construct from byte-oriented to multibyte-character-oriented would probably break a lot of existing scripts. I don't believe Bram would ever accept that.

But sometimes, breaking things is required to make progress. The fact that we're having a conversation with both of us suggesting (fairly complicated) things that haven't worked is a perfect proof for the fact that the current system is counterintuitive and hard to use...

~Matt
Re: Is vim really fully unicoded?
On 07/01/09 02:14, Matt Wozniski wrote:
> On 1/6/09, Tony Mechelynck wrote:
>> On 07/01/09 00:39, Matt Wozniski wrote:
>>> On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
>>>> On 06/01/09 12:31, anhnmncb wrote:
>>>>> Hi, list,
>>>>>
>>>>> As the title says: if so, why do many functions still not handle Unicode correctly? For example, the expression:
>>>>>
>>>>>     getline('.')[col('.')-1]
>>>>>
>>>>> can't return a character outside the range of ASCII.
>>>>
>>>> because string[index] returns a byte value, not a character value: see :help expr8.
>>>
>>> *Nod*
>>>
>>>> If the character at the cursor is > U+007F, you'll get the first byte (in the range 0xC0-0xFD, or in practice in the range 0xC0-0xF4) of its UTF-8 representation.
>>>
>>> No, you could get some byte of some entirely different character. I.e., on a line with two 2-byte characters, getline('.')[col('.')-1] on the second character would return the 2nd byte of the first character.
>>
>> col() gives a one-based byte ordinal. [] takes a zero-based argument. I stand by what I said.
>
> Ooh, you're right - I forgot col() returned a byte index, and not the column as its name would imply...
>
>>>> The _character_ at the cursor is obtained as follows:
>>>>
>>>>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>>>     let i1 = byteidx(getline('.'), virtcol('.'))
>>>>     let character = strpart(getline('.'), i0, i1 - 10)
>>>
>>> Using virtcol() there seems broken... what if you're in the middle of a tab, for example, with virtualedit=all?
>>>
>>>     :echo join(split("áéíóú", '\zs')[1:3], '')
>>
>> OK, I didn't think of virtual editing, nor even, it seems, of multi-column characters such as tabs and fullwidth CJK. However, [1:3] wouldn't work because the idea is that we're in a script, we don't know that we're in the 1st, 2nd or 3rd column, just that we want whatever is at the cursor. I might do it with
>>
>>     function CursorChar()
>>         normal yl
>>         return @@
>>     endfunction
>
>     echo matchstr(getline('.'), '\%' . col('.') . 'c.')

Again, col('.') is a byte index, not a column. What about virtcol('.') instead? To avoid clobbering @@ I could save/restore it.

> does the same thing without clobbering the unnamed register... slightly more elegant, imho.
>
>>> is how I would do it... but, is there any real reason why indexing into a string *should* be byte oriented instead of character oriented, apart from backwards compatibility? It seems drastically less easy to use the thing that more people want to use more of the time; and in fact some of the snippets in the vim help (like the example given at :help expr-8) won't work on multibyte lines given the way that string indexing works now. It seems like a place where the cost of losing backwards compatibility might be outweighed by the cost of keeping things the way they are...
>>
>> Changing an existing construct from byte-oriented to multibyte-character-oriented would probably break a lot of existing scripts. I don't believe Bram would ever accept that.
>
> But sometimes, breaking things is required to make progress. The fact that we're having a conversation with both of us suggesting (fairly complicated) things that haven't worked is a perfect proof for the fact that the current system is counterintuitive and hard to use...
>
> ~Matt

That's no reason for breaking what does work. I don't mind counterintuitive as long as it's documented.

Best regards,
Tony.
-- 
They told me you had proven it          When they discovered our results
About a month before.                   Their hair began to curl
The proof was valid, more or less       Instead of understanding it
But rather less than more.              We'd run the thing through PRL.

He sent them word that we would try     Don't tell a soul about all this
To pass where they had failed           For it must ever be
And after we were done, to them         A secret, kept from all the rest
The new proof would be mailed.          Between yourself and me.

My notion was to start again
Ignoring all they'd done
We quickly turned it into code
To see if it would run.
Re: Is vim really fully unicoded?
On Wed, 07 Jan 2009 08:25:35 +0800, Tony Mechelynck wrote:
> On 07/01/09 00:39, Matt Wozniski wrote:
>> On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
>>> On 06/01/09 12:31, anhnmncb wrote:
>>>> Hi, list,
>>>>
>>>> As the title says: if so, why do many functions still not handle Unicode correctly? For example, the expression:
>>>>
>>>>     getline('.')[col('.')-1]
>>>>
>>>> can't return a character outside the range of ASCII.
>>>
>>> because string[index] returns a byte value, not a character value: see :help expr8.
>>
>> *Nod*
>>
>>> If the character at the cursor is > U+007F, you'll get the first byte (in the range 0xC0-0xFD, or in practice in the range 0xC0-0xF4) of its UTF-8 representation.
>>
>> No, you could get some byte of some entirely different character. I.e., on a line with two 2-byte characters, getline('.')[col('.')-1] on the second character would return the 2nd byte of the first character.
>
> col() gives a one-based byte ordinal. [] takes a zero-based argument. I stand by what I said.
>
>>> The _character_ at the cursor is obtained as follows:
>>>
>>>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>>     let i1 = byteidx(getline('.'), virtcol('.'))
>>>     let character = strpart(getline('.'), i0, i1 - 10)
>>
>> Using virtcol() there seems broken... what if you're in the middle of a tab, for example, with virtualedit=all?
>>
>>     :echo join(split("áéíóú", '\zs')[1:3], '')
>
> OK, I didn't think of virtual editing, nor even, it seems, of multi-column characters such as tabs and fullwidth CJK. However, [1:3] wouldn't work because the idea is that we're in a script, we don't know that we're in the 1st, 2nd or 3rd column, just that we want whatever is at the cursor. I might do it with
>
>     function CursorChar()
>         normal yl
>         return @@
>     endfunction
>
>> is how I would do it... but, is there any real reason why indexing into a string *should* be byte oriented instead of character oriented, apart from backwards compatibility? It seems drastically less easy to use the thing that more people want to use more of the time; and in fact some of the snippets in the vim help (like the example given at :help expr-8) won't work on multibyte lines given the way that string indexing works now. It seems like a place where the cost of losing backwards compatibility might be outweighed by the cost of keeping things the way they are...
>>
>> ~Matt
>
> Changing an existing construct from byte-oriented to multibyte-character-oriented would probably break a lot of existing scripts. I don't believe Bram would ever accept that.
>
> Best regards,
> Tony.

Hmm, I think I got the point. By the way, I tested your code on a line containing 测试(test):

    let i0 = byteidx(getline('.'), virtcol('.') - 1)
    let i1 = byteidx(getline('.'), virtcol('.'))
    let character = strpart(getline('.'), i0, i1 - 10)

Then "echo character" printed nothing.

-- 
Regards,
Van.
Re: Is vim really fully unicoded?
On 07/01/09 02:10, Yue Wu wrote:
> On Wed, 07 Jan 2009 08:25:35 +0800, Tony Mechelynck wrote:
>> On 07/01/09 00:39, Matt Wozniski wrote:
>>> On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
>>>> On 06/01/09 12:31, anhnmncb wrote:
>>>>> Hi, list,
>>>>>
>>>>> As the title says: if so, why do many functions still not handle Unicode correctly? For example, the expression:
>>>>>
>>>>>     getline('.')[col('.')-1]
>>>>>
>>>>> can't return a character outside the range of ASCII.
>>>>
>>>> because string[index] returns a byte value, not a character value: see :help expr8.
>>>
>>> *Nod*
>>>
>>>> If the character at the cursor is > U+007F, you'll get the first byte (in the range 0xC0-0xFD, or in practice in the range 0xC0-0xF4) of its UTF-8 representation.
>>>
>>> No, you could get some byte of some entirely different character. I.e., on a line with two 2-byte characters, getline('.')[col('.')-1] on the second character would return the 2nd byte of the first character.
>>
>> col() gives a one-based byte ordinal. [] takes a zero-based argument. I stand by what I said.
>>
>>>> The _character_ at the cursor is obtained as follows:
>>>>
>>>>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>>>     let i1 = byteidx(getline('.'), virtcol('.'))
>>>>     let character = strpart(getline('.'), i0, i1 - 10)
>>>
>>> Using virtcol() there seems broken... what if you're in the middle of a tab, for example, with virtualedit=all?
>>>
>>>     :echo join(split("áéíóú", '\zs')[1:3], '')
>>
>> OK, I didn't think of virtual editing, nor even, it seems, of multi-column characters such as tabs and fullwidth CJK. However, [1:3] wouldn't work because the idea is that we're in a script, we don't know that we're in the 1st, 2nd or 3rd column, just that we want whatever is at the cursor. I might do it with
>>
>>     function CursorChar()
>>         normal yl
>>         return @@
>>     endfunction
>>
>>> is how I would do it... but, is there any real reason why indexing into a string *should* be byte oriented instead of character oriented, apart from backwards compatibility? It seems drastically less easy to use the thing that more people want to use more of the time; and in fact some of the snippets in the vim help (like the example given at :help expr-8) won't work on multibyte lines given the way that string indexing works now. It seems like a place where the cost of losing backwards compatibility might be outweighed by the cost of keeping things the way they are...
>>>
>>> ~Matt
>>
>> Changing an existing construct from byte-oriented to multibyte-character-oriented would probably break a lot of existing scripts. I don't believe Bram would ever accept that.
>>
>> Best regards,
>> Tony.
>
> Hmm, I think I got the point. By the way, I tested your code on a line containing 测试(test):
>
>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>     let i1 = byteidx(getline('.'), virtcol('.'))
>     let character = strpart(getline('.'), i0, i1 - 10)
>
> Then "echo character" printed nothing.

Try the function in my next post. If you don't want to clobber the unnamed register, here is a variant:

    function CursorChar()
        let unnamed = @@
        normal yl
        let retval = @@
        let @@ = unnamed
        return retval
    endfunction

Best regards,
Tony.
-- 
If you had any brains, you'd be dangerous.
Re: Is vim really fully unicoded?
On Wed, 07 Jan 2009 10:24:30 +0800, Tony Mechelynck wrote:
On 07/01/09 02:10, Yue Wu wrote:
On Wed, 07 Jan 2009 08:25:35 +0800, Tony Mechelynck wrote:
On 07/01/09 00:39, Matt Wozniski wrote:
On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
On 06/01/09 12:31, anhnmncb wrote:

Hi, list, as title, if so, why can't many functions still handle unicode correctly? For example the func:

    getline('.')[col('.')-1]

Can't return a character outside the range of ascii.

because string[index] returns a byte value, not a character value: see :help expr8.

*Nod*

If the character at the cursor is above U+007F, you'll get the first byte (in the range 0xC0-0xFD, or in practice in the range 0xC0-0xF4) of its UTF-8 representation.

No, you could get some byte of some entirely different character. I.e., on a line with two 2-byte characters, getline('.')[col('.')-1] on the second character would return the 2nd byte of the first character.

col() gives a one-based byte ordinal. [] takes a zero-based argument. I stand by what I said.

The _character_ at the cursor is obtained as follows:

    let i0 = byteidx(getline('.'), virtcol('.') - 1)
    let i1 = byteidx(getline('.'), virtcol('.'))
    let character = strpart(getline('.'), i0, i1 - 10)

Using virtcol() there seems broken... what if you're in the middle of a tab, for example, with virtualedit=all?

    :echo join(split('áéíóú', '\zs')[1:3], '')

OK, I didn't think of virtual editing, nor even, it seems, of multi-column characters such as tabs and fullwidth CJK. However, [1:3] wouldn't work because the idea is that we're in a script, we don't know that we're in the 1st, 2nd or 3rd column, just that we want whatever is at the cursor. I might do it with

    function CursorChar()
        normal yl
        return @@
    endfunction

is how I would do it... but, is there any real reason why indexing into a string *should* be byte oriented instead of character oriented, apart from backwards compatibility?

It seems drastically less easy to use the thing that more people want to use more of the time; and in fact some of the snippets in the vim help (like the example given at :help expr-8) won't work on multibyte lines given the way that string indexing works now. It seems like a place where the cost of losing backwards compatibility might be outweighed by the cost of keeping things the way they are...

~Matt

Changing an existing construct from byte-oriented to multibyte-character-oriented would probably break a lot of existing scripts. I don't believe Bram would ever accept that.

Best regards,
Tony.

Hmm, I think I got the point. btw, I tested your func on a line with 测试(test)

    let i0 = byteidx(getline('.'), virtcol('.') - 1)
    let i1 = byteidx(getline('.'), virtcol('.'))
    let character = strpart(getline('.'), i0, i1 - 10)

Then echo character got nothing.

Try the function in my next post. If you don't want to clobber the unnamed register, here is a variant:

    function CursorChar()
        let unnamed = @@
        normal yl
        let retval = @@
        let @@ = unnamed
        return retval
    endfunction

Yes, it works, but I don't like a function that contains normal operators, I always think that a normal operator is only used for normal mode by keyboard, if write a function, it's better to use the function corresponding to the operator. This version works fine:

    matchstr(getline('.'), '\%' . col('.') . 'c.')

whereas this one doesn't:

    matchstr(getline('.'), '\%' . virtcol('.') . 'c.')

Best regards,
Tony.

-- 
Regards,
Van.
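The difference between byte columns and characters discussed in this message can be seen on a short multibyte string. The following is only a sketch, assuming 'encoding' is utf-8; CharAtByteCol is a hypothetical helper name that does not appear in the thread, and it simply wraps the matchstr() expression reported above as working.

```vim
scriptencoding utf-8

" Sketch only: assumes 'encoding' is utf-8.
" CharAtByteCol is a hypothetical name, not part of Vim.
" a:bytecol is a 1-based byte column, i.e. the kind of value col('.') returns.
function! CharAtByteCol(text, bytecol)
  " \%Nc anchors the match at byte column N; '.' then takes one whole character.
  return matchstr(a:text, '\%' . a:bytecol . 'c.')
endfunction

" Each of these two characters occupies 3 bytes in UTF-8,
" so the second character starts at byte column 4.
echo strlen('测试')              " 6 (strlen() counts bytes)
echo CharAtByteCol('测试', 1)    " 测
echo CharAtByteCol('测试', 4)    " 试
```

Passing virtcol('.') instead of col('.') would hand the pattern a screen column where it expects a byte column, which is consistent with the failure reported for the second variant.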
Re: Is vim really fully unicoded?
On 07/01/09 03:38, Yue Wu wrote:

[...] I always think that a normal operator is only used for normal mode by keyboard,[...]

Oh? I have the opposite impression. For normal mode by keyboard, I don't use

    :normal yl<Enter>

but

    yl

To me, the :normal command is _only_ useful in scripts, in order to run in Ex mode the key sequences meant for Normal mode.

Best regards,
Tony.
-- 
If bankers can count, how come they have eight windows and only four tellers?
Re: Is vim really fully unicoded?
On Wed, 07 Jan 2009 10:55:33 +0800, Tony Mechelynck wrote:
On 07/01/09 03:38, Yue Wu wrote:

[...] I always think that a normal operator is only used for normal mode by keyboard,[...]

Oh? I have the opposite impression. For normal mode by keyboard, I don't use

    :normal yl<Enter>

but

    yl

To me, the :normal command is _only_ useful in scripts, in order to run in Ex mode the key sequences meant for Normal mode.

I mean I prevent using yl from :normal if there is a function :yank :)

-- 
Regards,
Van.
Re: Is vim really fully unicoded?
On 07/01/09 04:17, Yue Wu wrote:
On Wed, 07 Jan 2009 10:55:33 +0800, Tony Mechelynck wrote:
On 07/01/09 03:38, Yue Wu wrote:

[...] I always think that a normal operator is only used for normal mode by keyboard,[...]

Oh? I have the opposite impression. For normal mode by keyboard, I don't use

    :normal yl<Enter>

but

    yl

To me, the :normal command is _only_ useful in scripts, in order to run in Ex mode the key sequences meant for Normal mode.

I mean I prevent using yl from :normal if there is a function :yank :)

There is a :yank command but it acts linewise. Here we want a characterwise yank, so we cannot use :yank. The function you proposed is so complex I would run much more risk when trying to construct it than with :normal yl. If the complexity is similar, I use the ex-command in scripts, for instance

    :wincmd k

rather than

    :normal ^Wk

where ^W would be obtained by hitting Ctrl-V followed by Ctrl-W.

Best regards,
Tony.
-- 
ARTHUR: Shut up! Will you shut up!
DENNIS: Ah, now we see the violence inherent in the system.
ARTHUR: Shut up!
DENNIS: Oh! Come and see the violence inherent in the system! HELP! HELP! I'm being repressed!
        The Quest for the Holy Grail (Monty Python)
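For readers who want to try the :normal yl approach from a script, here is a minimal sketch along the lines of the CursorChar() variant posted earlier in the thread. It is not the code from the thread: the use of getreg(), getregtype() and setreg() to restore the register type, the clearing of the register before the yank, and the bang on :normal are additions made here as assumptions about what a more defensive version might look like.

```vim
" Sketch only: a more defensive take on the CursorChar() idea from this thread.
function! CursorChar()
  " Remember the unnamed register's contents and its type
  " (characterwise, linewise or blockwise).
  let saved = getreg('"')
  let saved_type = getregtype('"')
  " Empty the register first, so that a failed yank cannot
  " hand back whatever text was in it before.
  call setreg('"', '')
  " :normal! (with the bang) ignores user mappings for y and l.
  normal! yl
  let char = getreg('"')
  " Put the unnamed register back the way it was.
  call setreg('"', saved, saved_type)
  return char
endfunction
```

Typing :echo CursorChar() with the cursor on a multibyte character should show that whole character. The yank still overwrites register 0 and moves the '[ and '] marks, so this is not free of side effects. Where a control key is needed in such a script, :execute "normal! \<C-W>k" avoids entering a literal ^W, although :wincmd k remains the simpler choice, as noted above.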
Re: Is vim really fully unicoded?
On 1/6/09, Tony Mechelynck wrote:
On 1/6/09, Matt Wozniski wrote:

    echo matchstr(getline('.'), '\%' . col('.') . 'c.')

Again, col('.') is a byte index, not a column. What about virtcol('.') instead?

Nope. \%15c is also a byte index, not a column (which is also counter-intuitive, and brings us back to the problem - that however well documented it is, even experienced vimscript programmers get this stuff wrong regularly.)

Changing an existing construct from byte-oriented to multibyte-character-oriented would probably break a lot of existing scripts. I don't believe Bram would ever accept that.

But sometimes, breaking things is required to make progress. The fact that we're having a conversation with both of us suggesting (fairly complicated) things that haven't worked is a perfect proof for the fact that the current system is counterintuitive and hard to use...

That's no reason for breaking what does work. I don't mind counterintuitive as long as it's documented.

See above. If no one can remember how to use it, or the workarounds to make it work are worth more trouble to the author than the trouble of not having it work on multibyte input, I'd say that it _doesn't_ work as is. In fact, I'd argue that having string indexing be byte-oriented after multibyte was added was a regression that broke things that did work: before, getline('.')[col('.')-1] was a valid way to get the character under the cursor, and afterwards it was not. Changing this behavior would probably break very few scripts, since I doubt most scripters are defensive about doing it correctly, and would mean that all the broken code that already exists, and even the code that was written before proper multibyte support was added (I believe it was added after vimscript, right?), would continue to work *unless* it was written intentionally to work around this issue. And I think that authors who knew enough to work around this would, by and large, be happy to see it fixed.

I think that the advantages of having new scripts work the way that they should, instead of the way that they do, would greatly outweigh the disadvantages of breaking scripts depending on the broken behavior. But, Bram's opinion is the final answer, so we'll see if he weighs in.

~Matt
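As an illustration of what character-oriented indexing could look like in a script today, the split() idiom quoted earlier in the thread can be wrapped in a small helper. This is a sketch under stated assumptions: 'encoding' is utf-8, the sample string uses precomposed characters, and CharAt is a hypothetical name introduced here, not a Vim builtin and not something proposed in the thread.

```vim
scriptencoding utf-8

" Sketch only: assumes 'encoding' is utf-8.
" CharAt is a hypothetical helper; a:idx is a 0-based character index.
function! CharAt(text, idx)
  " split() with the pattern '\zs' breaks the string between every character,
  " so list indexing then works per character instead of per byte.
  " get() returns '' when the index is past the end of the list.
  return get(split(a:text, '\zs'), a:idx, '')
endfunction

echo strlen('áéíóú')                          " 10 (bytes, two per character)
echo CharAt('áéíóú', 1)                       " é
echo join(split('áéíóú', '\zs')[1:3], '')     " éíó
```

By contrast, 'áéíóú'[1] yields a single byte from the middle of the first character, which is the behavior being debated in this message.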