Patch 7.2.078

2009-01-06 Thread Bram Moolenaar


Patch 7.2.078
Problem:    When deleting a fold that is specified with markers the cursor
            position may be wrong.  Folds may not be displayed properly after
            a delete.  Wrong fold may be deleted.
Solution:   Fix the problems. (mostly by Lech Lorens)
Files:      src/fold.c


*** ../vim-7.2.077/src/fold.c   Fri Nov 28 21:26:50 2008
--- src/fold.c  Tue Jan  6 14:53:26 2009
***************
*** 740,746 ****
      garray_T    *found_ga;
      fold_T      *found_fp = NULL;
      linenr_T    found_off = 0;
!     int         use_level = FALSE;
      int         maybe_small = FALSE;
      int         level = 0;
      linenr_T    lnum = start;
--- 740,746 ----
      garray_T    *found_ga;
      fold_T      *found_fp = NULL;
      linenr_T    found_off = 0;
!     int         use_level;
      int         maybe_small = FALSE;
      int         level = 0;
      linenr_T    lnum = start;
***************
*** 757,762 ****
--- 757,763 ----
          gap = &curwin->w_folds;
          found_ga = NULL;
          lnum_off = 0;
+         use_level = FALSE;
          for (;;)
          {
              if (!foldFind(gap, lnum - lnum_off, &fp))
***************
*** 783,802 ****
          else
          {
              lnum = found_fp->fd_top + found_fp->fd_len + found_off;
-             did_one = TRUE;
  
              if (foldmethodIsManual(curwin))
                  deleteFoldEntry(found_ga,
                      (int)(found_fp - (fold_T *)found_ga->ga_data), recursive);
              else
              {
!                 if (found_fp->fd_top + found_off < first_lnum)
!                     first_lnum = found_fp->fd_top;
!                 if (lnum > last_lnum)
                      last_lnum = lnum;
!                 parseMarker(curwin);
                  deleteFoldMarkers(found_fp, recursive, found_off);
              }
  
              /* redraw window */
              changed_window_setting();
--- 784,804 ----
          else
          {
              lnum = found_fp->fd_top + found_fp->fd_len + found_off;
  
              if (foldmethodIsManual(curwin))
                  deleteFoldEntry(found_ga,
                      (int)(found_fp - (fold_T *)found_ga->ga_data), recursive);
              else
              {
!                 if (first_lnum > found_fp->fd_top + found_off)
!                     first_lnum = found_fp->fd_top + found_off;
!                 if (last_lnum < lnum)
                      last_lnum = lnum;
!                 if (!did_one)
!                     parseMarker(curwin);
                  deleteFoldMarkers(found_fp, recursive, found_off);
              }
+             did_one = TRUE;
  
              /* redraw window */
              changed_window_setting();
***************
*** 811,816 ****
--- 813,822 ----
          redraw_curbuf_later(INVERTED);
  #endif
      }
+     else
+         /* Deleting markers may make cursor column invalid. */
+         check_cursor_col();
+ 
      if (last_lnum > 0)
          changed_lines(first_lnum, (colnr_T)0, last_lnum, 0L);
  }
*** ../vim-7.2.077/src/version.c        Wed Dec 31 16:20:54 2008
--- src/version.c   Tue Jan  6 15:00:36 2009
***************
*** 678,679 ****
--- 678,681 ----
  {   /* Add new patch number below this line */
+ /**/
+     78,
  /**/
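
The range bookkeeping that this patch corrects can be sketched in isolation. The following is a minimal illustration, not the code from src/fold.c: `changed_range_T`, `range_init()` and `range_add_fold()` are hypothetical names, and only the comparison logic mirrors the new hunk, where the absolute top line is the fold's relative top plus the offset of the enclosing fold.

```c
#include <assert.h>

typedef long linenr_T;      /* line number type, named after Vim's typedef */

/* Hypothetical stand-in for the first_lnum/last_lnum pair in the patch. */
typedef struct {
    linenr_T first;         /* smallest changed line seen so far */
    linenr_T last;          /* largest changed line seen so far, 0 = none */
} changed_range_T;

static void range_init(changed_range_T *r)
{
    assert(r != 0);
    r->first = 0x7fffffffL; /* assumed to exceed any real line number */
    r->last = 0;
}

/*
 * Record one deleted fold.  "top" is relative to the enclosing fold and
 * "off" is the absolute offset of that enclosing fold, so both bounds must
 * include "off" before they are compared or stored.
 */
static void range_add_fold(changed_range_T *r, linenr_T top, linenr_T len,
                           linenr_T off)
{
    linenr_T abs_top = top + off;
    linenr_T abs_end = top + len + off;

    if (r->first > abs_top)
        r->first = abs_top;
    if (r->last < abs_end)
        r->last = abs_end;
}
```

With a nested fold at relative line 2 inside a parent that starts at offset 10, the recorded first line is 12. Storing the relative value 2, as the replaced lines did, would report a changed range that starts too early.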

-- 
Looking at Perl through Lisp glasses, Perl looks atrocious.

 /// Bram Moolenaar -- b...@moolenaar.net -- http://www.Moolenaar.net   \\\
///sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\download, build and distribute -- http://www.A-A-P.org///
 \\\help me help AIDS victims -- http://ICCF-Holland.org///

--~--~-~--~~~---~--~~
You received this message from the vim_dev maillist.
For more information, visit http://www.vim.org/maillist.php
-~--~~~~--~~--~--~---



Patch 7.2.079

2009-01-06 Thread Bram Moolenaar


Patch 7.2.079
Problem:    "killed" netbeans events are not handled correctly.
Solution:   A "killed" netbeans event is sent when the buffer is deleted or
            wiped out (in this case, the netbeans annotations in this buffer
            have been removed).  A user can still remove a sign with the
            command ":sign unplace" and this does not trigger a "killed"
            event.  (Xavier de Gaye)
Files:      runtime/doc/netbeans.txt, src/buffer.c, src/globals.h,
            src/netbeans.c, src/proto/netbeans.pro
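
For orientation, the event line itself is small. The sketch below assumes the `%d:killed=%d\n` layout used by the sprintf() call in src/netbeans.c (buffer number, event name, sequence number). `format_killed_event()` is a hypothetical helper, not a function from the Vim sources, and it uses snprintf() so that an undersized buffer is reported instead of overrun.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a "killed" event line into "buf" (hypothetical helper).
 * Returns the number of characters written, or -1 when the line does
 * not fit in "size" bytes.
 */
static int format_killed_event(char *buf, size_t size, int bufno, int cmdno)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "%d:killed=%d\n", bufno, cmdno);
    if (n < 0 || (size_t)n >= size)
        return -1;      /* encoding error or truncated output */
    return n;
}
```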


*** ../vim-7.2.078/runtime/doc/netbeans.txt     Sat Aug  9 19:36:49 2008
--- runtime/doc/netbeans.txt    Tue Jan  6 15:23:39 2009
***************
*** 1,4 ****
! *netbeans.txt*  For Vim version 7.2.  Last change: 2008 Jun 28
  
  
                    VIM REFERENCE MANUAL    by Gordon Prieur et al.
--- 1,4 ----
! *netbeans.txt*  For Vim version 7.2.  Last change: 2009 Jan 06
  
  
                    VIM REFERENCE MANUAL    by Gordon Prieur et al.
***************
*** 722,729 ****
                  of the cursor.
                  New in version 2.1.
  
! killed          A file was closed by the user.  Only for files that have been
!                 assigned a number by the IDE.
  
  newDotAndMark off off
                  Reports the position of the cursor being at off bytes into
--- 722,731 ----
                  of the cursor.
                  New in version 2.1.
  
! killed          A file was deleted or wiped out by the user and the buffer
!                 annotations have been removed.  The bufID number for this
!                 buffer has become invalid.  Only for files that have been
!                 assigned a bufID number by the IDE.
  
  newDotAndMark off off
                  Reports the position of the cursor being at off bytes into
*** ../vim-7.2.078/src/buffer.c Wed Dec  3 11:21:20 2008
--- src/buffer.c        Tue Jan  6 15:23:02 2009
***************
*** 437,446 ****
          return;
  #endif
  
- #ifdef FEAT_NETBEANS_INTG
-     if (usingNetbeans)
-         netbeans_file_closed(buf);
- #endif
      /* Change directories when the 'acd' option is set. */
      DO_AUTOCHDIR
  
--- 437,442 ----
***************
*** 639,644 ****
--- 635,644 ----
  #ifdef FEAT_SIGNS
      buf_delete_signs(buf);              /* delete any signs */
  #endif
+ #ifdef FEAT_NETBEANS_INTG
+     if (usingNetbeans)
+         netbeans_file_killed(buf);
+ #endif
  #ifdef FEAT_LOCALMAP
      map_clear_int(buf, MAP_ALL_MODES, TRUE, FALSE);  /* clear local mappings */
      map_clear_int(buf, MAP_ALL_MODES, TRUE, TRUE);   /* clear local abbrevs */
***************
*** 815,823 ****
      int         bnr;                    /* buffer number */
      char_u      *p;
  
- #ifdef FEAT_NETBEANS_INTG
-     netbeansCloseFile = 1;
- #endif
      if (addr_count == 0)
      {
          (void)do_buffer(command, DOBUF_CURRENT, FORWARD, 0, forceit);
--- 815,820 ----
***************
*** 912,920 ****
          }
      }
  
- #ifdef FEAT_NETBEANS_INTG
-     netbeansCloseFile = 0;
- #endif
  
      return errormsg;
  }
--- 909,914 ----
*** ../vim-7.2.078/src/globals.h        Fri Nov 28 21:26:50 2008
--- src/globals.h   Tue Jan  6 15:23:02 2009
***************
*** 1340,1346 ****
  
  #ifdef FEAT_NETBEANS_INTG
  EXTERN char *netbeansArg INIT(= NULL);  /* the -nb[:host:port:passwd] arg */
- EXTERN int netbeansCloseFile INIT(= 0); /* send killed if != 0 */
  EXTERN int netbeansFireChanges INIT(= 1); /* send buffer changes if != 0 */
  EXTERN int netbeansForcedQuit INIT(= 0);/* don't write modified files */
  EXTERN int netbeansReadFile INIT(= 1);  /* OK to read from disk if != 0 */
--- 1340,1345 ----
*** ../vim-7.2.078/src/netbeans.c       Wed Dec 24 12:20:10 2008
--- src/netbeans.c      Tue Jan  6 15:23:02 2009
***************
*** 2921,2964 ****
  }
  
  /*
!  * Tell netbeans a file was closed.
   */
  void
! netbeans_file_closed(buf_T *bufp)
  {
      int         bufno = nb_getbufno(bufp);
      nbbuf_T     *nbbuf = nb_get_buf(bufno);
      char        buffer[2*MAXPATHL];
  
!     if (!haveConnection || bufno < 0)
          return;
  
!     if (!netbeansCloseFile)
!     {
!         nbdebug(("Ignoring file_closed for %s. File was closed from IDE\n",
!                     bufp->b_ffname));
!         return;
!     }
! 
!     nbdebug(("netbeans_file_closed:\n"));
!     nbdebug(("Closing bufno: %d", bufno));
!     if (curbuf != NULL && curbuf != bufp)
!     {
!         nbdebug(("Curbuf bufno:  %d\n", nb_getbufno(curbuf)));
!     }
!     else if (curbuf == bufp)
!     {
!         nbdebug(("curbuf == bufp\n"));
!     }
! 
!     if (bufno <= 0)
!         return;
  
      sprintf(buffer, "%d:killed=%d\n", bufno, r_cmdno);
  
      nbdebug(("EVT: %s", buffer));
  
!     nb_send(buffer, "netbeans_file_closed");
  
      if (nbbuf != NULL)
          nbbuf->bufp = NULL;
--- 2921,2946 ----
  }
  
  /*
!  * Tell netbeans that a file was deleted or wiped out.
   */
  void
! netbeans_file_killed(buf_T *bufp)
  {
      int         bufno = 

Re: [patch] bug in completion with i_CTRL-N using arabic or persian keymap

2009-01-06 Thread Bram Moolenaar


Dominique Pelle wrote:

> Hi.
>
> I can reproduce the following error with valgrind memory
> checker using Vim-7.2.75 (huge) on Linux x86 with utf-8 locale:
>
> ==15276== Invalid read of size 1
> ==15276==    at 0x4026438: strlen (mc_replace_strmem.c:242)
> ==15276==    by 0x8107E39: ins_bytes (misc1.c:1860)
> ==15276==    by 0x8067EC0: ins_compl_new_leader (edit.c:3212)
> ==15276==    by 0x8068048: ins_compl_addleader (edit.c:3297)
> ==15276==    by 0x80641AA: edit (edit.c:765)
> ==15276==    by 0x812F248: invoke_edit (normal.c:8901)
> ==15276==    by 0x812F1EE: nv_edit (normal.c:8874)
> ==15276==    by 0x8122A3C: normal_cmd (normal.c:1200)
> ==15276==    by 0x80E5C9D: main_loop (main.c:1180)
> ==15276==    by 0x80E57EA: main (main.c:939)
> ==15276==  Address 0x4e671af is 1 bytes before a block of size 3 alloc'd
> ==15276==    at 0x4025D2E: malloc (vg_replace_malloc.c:207)
> ==15276==    by 0x811303C: lalloc (misc2.c:859)
> ==15276==    by 0x8112F58: alloc (misc2.c:758)
> ==15276==    by 0x81133EF: vim_strnsave (misc2.c:1176)
> ==15276==    by 0x8068035: ins_compl_addleader (edit.c:3294)
> ==15276==    by 0x80641AA: edit (edit.c:765)
> ==15276==    by 0x812F248: invoke_edit (normal.c:8901)
> ==15276==    by 0x812F1EE: nv_edit (normal.c:8874)
> ==15276==    by 0x8122A3C: normal_cmd (normal.c:1200)
> ==15276==    by 0x80E5C9D: main_loop (main.c:1180)
> ==15276==    by 0x80E57EA: main (main.c:939)
>
> Steps to reproduce:
>
> 1/ Create a sample tag file (using Vim source files for example):
>
>    $ cd vim7/src
>    $ ctags *.c *.h
>
> 2/ Create a minimalistic vimrc file enough to trigger bug:
>
>    set completeopt=menuone,longest
>    set tags=tags
>    set keymap=arabic
>
>    I tried several keymaps (not all of them), but I can somehow only
>    reproduce this bug using 'set keymap=arabic' or 'set keymap=persian'.
>
> 3/ Start Vim with valgrind:
>
>    $ valgrind vim -u test.vimrc 2> valgrind.log
>
> 4/ Type the following commands in Normal mode (completion using pum & tags):
>
>    i-ctrl-nX
>
> 5/ Observe the above valgrind error in valgrind.log right after
>    pressing X in step 4/
>
> edit.c, around line 3212:
>
>   3207 static void
>   3208 ins_compl_new_leader()
>   3209 {
>   3210     ins_compl_del_pum();
>   3211     ins_compl_delete();
> ! 3212     ins_bytes(compl_leader + curwin->w_cursor.col - compl_col);
>   3213     compl_used_match = FALSE;
>
> When bug happens, I see that curwin->w_cursor.col is 0, and compl_col
> is 1, so argument of ins_bytes() at line 3212 is 1 byte before beginning
> of string compl_leader (hence the error).  Without keymap, or with
> other keymaps than arabic or persian, curwin->w_cursor.col is 1 and
> compl_col is also 1 (so bug then does not happen).
>
> I'm not sure what's the right way to fix it: obviously we can do
> a check for curwin->w_cursor.col being greater than or equal to compl_col
> as in attached patch.  Although it pacifies valgrind, it may only work
> around the bug.  I was testing Vim with keymaps and I don't know how
> arabic and persian keymaps are supposed to behave to tell whether the
> behavior is correct (but the valgrind error is clearly not expected).

It turns out that the X inserted is keymapped to a composing
character.  When deleting the completed text this causes the character
before it, the -, also to be deleted.
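
Why a composing character drags the "-" along can be sketched as follows. This is an illustration only, not Vim's implementation: `utf8_decode()`, `is_composing()` and `char_with_composing_len()` are hypothetical helpers, the input is assumed to be well-formed UTF-8, and the composing ranges listed are a small illustrative subset (U+0300..U+036F and the Arabic marks U+064B..U+065F). The message does not say which character X maps to, so any concrete code point used with this sketch is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one well-formed UTF-8 sequence at "s"; return its byte length. */
static size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((unsigned long)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((unsigned long)(s[0] & 0x0F) << 12)
            | ((unsigned long)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
        return 3;
    }
    *cp = ((unsigned long)(s[0] & 0x07) << 18)
        | ((unsigned long)(s[1] & 0x3F) << 12)
        | ((unsigned long)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
    return 4;
}

/* Illustrative subset of composing characters, not a complete table. */
static int is_composing(unsigned long cp)
{
    return (cp >= 0x0300 && cp <= 0x036F) || (cp >= 0x064B && cp <= 0x065F);
}

/*
 * Number of bytes taken by the character at "s" together with all
 * composing characters that directly follow it.  Deleting "one character"
 * in this sense removes the base character as well.
 */
static size_t char_with_composing_len(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    unsigned long cp;
    size_t len;

    assert(s != NULL && *s != '\0');
    len = utf8_decode(p, &cp);
    while (p[len] != '\0') {
        size_t n = utf8_decode(p + len, &cp);

        if (!is_composing(cp))
            break;
        len += n;
    }
    return len;
}
```

For a "-" followed by a two-byte composing mark, the helper reports three bytes: the mark and the "-" form one unit, which matches the over-deletion described above.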

The patch below avoids deleting too much.  And also avoids the offset
going negative, in case there is another situation where this happens.

Let me know if there are any remaining (or new) problems after this
patch.
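
The "offset going negative" guard can be illustrated on its own. The sketch below is not the code from the patch: `leader_offset()` and `leader_rest()` are hypothetical names, and the upper-bound check is an extra precaution that the message does not mention. It only shows the idea of clamping `cursor column - completion column` before using it as an index into the leader string.

```c
#include <assert.h>
#include <string.h>

/*
 * Number of leader bytes already present in the text (hypothetical helper).
 * Never negative, even when the cursor ended up before the completion
 * column.
 */
static int leader_offset(int cursor_col, int compl_col)
{
    int off = cursor_col - compl_col;

    return off < 0 ? 0 : off;
}

/* Return the part of "leader" that still has to be inserted. */
static const char *leader_rest(const char *leader, int cursor_col,
                               int compl_col)
{
    int off = leader_offset(cursor_col, compl_col);

    assert(leader != NULL);
    /* Extra precaution, not from the message: stay inside the string. */
    if ((size_t)off > strlen(leader))
        off = (int)strlen(leader);
    return leader + off;
}
```

With the values from the report (cursor column 0, completion column 1) the unclamped expression points one byte before the string, whereas the clamped version returns the whole leader.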

*** ../vim-7.2.079/src/edit.c   Wed Aug  6 18:56:55 2008
--- src/edit.c  Tue Jan  6 18:55:16 2009
***************
*** 147,152 ****
--- 147,153 ----
  static int  ins_compl_bs __ARGS((void));
  static void ins_compl_new_leader __ARGS((void));
  static void ins_compl_addleader __ARGS((int c));
+ static int ins_compl_len __ARGS((void));
  static void ins_compl_restart __ARGS((void));
  static void ins_compl_set_original_text __ARGS((char_u *str));
  static void ins_compl_addfrommatch __ARGS((void));
***************
*** 1933,1938 ****
--- 1934,1941 ----
  /*
   * Backspace the cursor until the given column.  Handles REPLACE and VREPLACE
   * modes correctly.  May also be used when not in insert mode at all.
+  * Will attempt not to go before "col" even when there is a composing
+  * character.
   */
      void
  backspace_until_column(col)
***************
*** 1944,1950 ****
          if (State & REPLACE_FLAG)
              replace_do_bs();
          else
!             (void)del_char(FALSE);
      }
  }
  #endif
--- 1947,1978 ----
          if (State & REPLACE_FLAG)
              replace_do_bs();
          else
!         {
! #ifdef FEAT_MBYTE
!             if (enc_utf8)
!             {
!                 int ecol = curwin->w_cursor.col + 1;
! 
!                 /* Make sure the cursor is at the start of a character, but
!                  * skip forward again when going too far back because of a
!                  * composing character. */
!                 mb_adjust_cursor();
!   

Is vim really fully unicoded?

2009-01-06 Thread anhnmncb

Hi, list, as title, if so, why can't many functions
still handle correctly with Unicode? For example the func:

    getline('.')[col('.')-1]

Can't return a character outside the range of ASCII.

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/




Re: Is vim really fully unicoded?

2009-01-06 Thread Tony Mechelynck

On 06/01/09 12:31, anhnmncb wrote:
> Hi, list, as title, if so, why can't many functions
> still handle correctly with Unicode? For example the func:
>
>     getline('.')[col('.')-1]
>
> Can't return a character outside the range of ASCII.

because string[index] returns a byte value, not a character value: see
:help expr8. If the character at the cursor is > U+007F, you'll get
the first byte (in the range 0xC0-0xFD, or in practice in the range
0xC0-0xF4) of its UTF-8 representation.

The _character_ at the cursor is obtained as follows:
    let i0 = byteidx(getline('.'), virtcol('.') - 1)
    let i1 = byteidx(getline('.'), virtcol('.'))
    let character = strpart(getline('.'), i0, i1 - 10)
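
The byte-versus-character distinction can be sketched in C, where string indexing is byte based as well. This is a minimal illustration under stated assumptions, not Vim code: `utf8_seq_len()` and `char_at_byte()` are hypothetical helpers, and the lead-byte ranges follow current UTF-8 (0xC2-0xF4 for multi-byte sequences), which is slightly narrower than the ranges quoted above.

```c
#include <assert.h>
#include <string.h>

/*
 * Length in bytes of the UTF-8 sequence introduced by lead byte "b".
 * ASCII, continuation bytes and invalid lead bytes count as 1.
 */
static size_t utf8_seq_len(unsigned char b)
{
    if (b >= 0xC2 && b <= 0xDF)
        return 2;
    if (b >= 0xE0 && b <= 0xEF)
        return 3;
    if (b >= 0xF0 && b <= 0xF4)
        return 4;
    return 1;
}

/*
 * Copy the character that starts at zero-based byte index "idx" of "line"
 * into "out", which must hold at least 5 bytes.  Returns the byte count.
 */
static size_t char_at_byte(const char *line, size_t idx, char *out)
{
    size_t len = strlen(line);
    size_t n;

    assert(idx < len);
    n = utf8_seq_len((unsigned char)line[idx]);
    if (n > len - idx)
        n = len - idx;      /* truncated sequence at end of line */
    memcpy(out, line + idx, n);
    out[n] = '\0';
    return n;
}
```

Indexing alone, `line[idx]`, yields only the lead byte, which corresponds to what `getline('.')[col('.')-1]` returns. Taking the whole sequence from that byte index onward yields the character.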

Best regards,
Tony.
-- 
Q:   How many hardware engineers does it take to change a lightbulb?
A:   None.  We'll fix it in software.




Re: Is vim really fully unicoded?

2009-01-06 Thread Matt Wozniski
On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
>
> On 06/01/09 12:31, anhnmncb wrote:
>> Hi, list, as title, if so, why can't many functions
>> still handle correctly with Unicode? For example the func:
>>
>>     getline('.')[col('.')-1]
>>
>> Can't return a character outside the range of ASCII.
>
> because string[index] returns a byte value, not a character value: see
> :help expr8.

*Nod*

> If the character at the cursor is > U+007F, you'll get
> the first byte (in the range 0xC0-0xFD, or in practice in the range
> 0xC0-0xF4) of its UTF-8 representation.

No, you could get some byte of some entirely different character.  I.e.,
on a line with two 2-byte characters, getline('.')[col('.')-1] on the
second character would return the 2nd byte of the first character.

> The _character_ at the cursor is obtained as follows:
>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>     let i1 = byteidx(getline('.'), virtcol('.'))
>     let character = strpart(getline('.'), i0, i1 - 10)

Using virtcol() there seems broken... what if you're in the middle of
a tab, for example, with virtualedit=all?

    :echo join(split("áéíóú", '\zs')[1:3], '')

is how I would do it... but, is there any real reason why indexing
into a string *should* be byte oriented instead of character oriented,
apart from backwards compatibility?  It seems drastically less easy to
use the thing that more people want to use more of the time; and in
fact some of the snippets in the vim help (like the example given at
:help expr-8) won't work on multibyte lines given the way that string
indexing works now.  It seems like a place where the cost of losing
backwards compatibility might be outweighed by the cost of keeping
things the way they are...
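
What character-oriented indexing has to do underneath can be sketched in C. This is an illustration, not Vim's implementation: `utf8_byteidx()` is a hypothetical helper modelled on the documented behaviour of byteidx() (byte offset of the nr-th character, -1 when there are fewer characters), `utf8_slice()` is a hypothetical counterpart of the inclusive `[1:3]` slice in the split() example, and the input is assumed to be well-formed UTF-8.

```c
#include <assert.h>
#include <string.h>

/*
 * Byte offset of the character with zero-based index "nr" in "s",
 * or -1 when "s" has fewer than "nr" characters.
 */
static long utf8_byteidx(const char *s, size_t nr)
{
    size_t i = 0;

    assert(s != NULL);
    while (nr > 0) {
        if (s[i] == '\0')
            return -1;
        i++;
        /* Skip continuation bytes (10xxxxxx) of the same character. */
        while (((unsigned char)s[i] & 0xC0) == 0x80)
            i++;
        nr--;
    }
    return (long)i;
}

/*
 * Copy characters "first" through "last" (inclusive, zero-based) of "s"
 * into "out".  Returns the number of bytes copied, 0 when the range is
 * empty, out of bounds, or does not fit in "outsize" bytes.
 */
static size_t utf8_slice(const char *s, size_t first, size_t last,
                         char *out, size_t outsize)
{
    long b0 = utf8_byteidx(s, first);
    long b1 = utf8_byteidx(s, last + 1);
    size_t n;

    assert(out != NULL && outsize > 0);
    out[0] = '\0';
    if (b0 < 0)
        return 0;                   /* "first" is past the end */
    if (b1 < 0)
        b1 = (long)strlen(s);       /* clip at the end of the string */
    if (b1 < b0)
        return 0;                   /* empty range */
    n = (size_t)(b1 - b0);
    if (n >= outsize)
        return 0;                   /* would not fit */
    memcpy(out, s + b0, n);
    out[n] = '\0';
    return n;
}
```

Every character index costs a scan from the start of the string, which is one practical difference from plain byte indexing.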

~Matt




Re: Is vim really fully unicoded?

2009-01-06 Thread Tony Mechelynck

On 07/01/09 00:39, Matt Wozniski wrote:
> No, you could get some byte of some entirely different character.  Ie,
> on a line with two 2-byte characters, getline('.')[col('.')-1] on the
> second character would return the 2nd byte of the first character.

col() gives a one-based byte ordinal. [] takes a zero-based argument. I 
stand by what I said.


> Using virtcol() there seems broken... what if you're in the middle of
> a tab, for example, with virtualedit=all?
>
>     :echo join(split("áéíóú", '\zs')[1:3], '')

OK, I didn't think of virtual editing, nor even, it seems, of 
multi-column characters such as tabs and fullwidth CJK. However, [1:3] 
wouldn't work because the idea is that we're in a script, we don't know 
that we're in the 1st, 2nd or 3rd column, just that we want whatever is 
at the cursor. I might do it with

    function CursorChar()
        normal yl
        return @@
    endfunction


> is how I would do it... but, is there any real reason why indexing
> into a string *should* be byte oriented instead of character oriented,
> apart from backwards compatibility?  It seems drastically less easy to
> use the thing that more people want to use more of the time; and in
> fact some of the snippets in the vim help (like the example given at
> :help expr-8) won't work on multibyte lines given the way that string
> indexing works now.  It seems like a place where the cost of losing
> backwards compatibility might be outweighed by the cost of keeping
> things the way they are...
>
> ~Matt

Changing an existing construct from byte-oriented to 
multibyte-character-oriented would probably break a lot of existing 
scripts. I don't believe Bram would ever accept that.

Best regards,
Tony.
-- 
A programmer is a person who passes as an exacting expert on the basis
of being able to turn out, after innumerable punching, an infinite
series of incomprehensive answers calculated with micrometric
precisions from vague assumptions based on debatable figures taken from
inconclusive documents and carried out on instruments of problematical
accuracy by persons of dubious reliability and questionable mentality
for the avowed purpose of annoying and confounding a hopelessly
defenseless department that was unfortunate enough to ask for the
information in the first place.
-- IEEE Grid news magazine




Re: Is vim really fully unicoded?

2009-01-06 Thread Matt Wozniski
On 1/6/09, Tony Mechelynck wrote:

> col() gives a one-based byte ordinal. [] takes a zero-based argument. I
> stand by what I said.

Ooh, you're right - I forgot col() returned a byte index, and not the
column as its name would imply...

> OK, I didn't think of virtual editing, nor even, it seems, of
> multi-column characters such as tabs and fullwidth CJK. However, [1:3]
> wouldn't work because the idea is that we're in a script, we don't know
> that we're in the 1st, 2nd or 3rd column, just that we want whatever is
> at the cursor. I might do it with
>
>     function CursorChar()
>         normal yl
>         return @@
>     endfunction

echo matchstr(getline('.'), '\%' . col('.') . 'c.')

does the same thing without clobbering the unnamed register...
slightly more elegant, imho.

> Changing an existing construct from byte-oriented to
> multibyte-character-oriented would probably break a lot of existing
> scripts. I don't believe Bram would ever accept that.

But sometimes, breaking things is required to make progress.  The fact
that we're having a conversation with both of us suggesting (fairly
complicated) things that haven't worked is a perfect proof for the
fact that the current system is counterintuitive and hard to use...

~Matt




Re: Is vim really fully unicoded?

2009-01-06 Thread Tony Mechelynck

On 07/01/09 02:14, Matt Wozniski wrote:
> echo matchstr(getline('.'), '\%' . col('.') . 'c.')

Again, col('.') is a byte index, not a column. What about virtcol('.') 
instead?

To avoid clobbering @@ I could save/restore it.


> does the same thing without clobbering the unnamed register...
> slightly more elegant, imho.

> But sometimes, breaking things is required to make progress.  The fact
> that we're having a conversation with both of us suggesting (fairly
> complicated) things that haven't worked is a perfect proof for the
> fact that the current system is counterintuitive and hard to use...
>
> ~Matt

That's no reason for breaking what does work. I don't mind 
counterintuitive as long as it's documented.


Best regards,
Tony.
-- 
They told me you had proven it  When they discovered our results
About a month before.   Their hair began to curl
The proof was valid, more or less   Instead of understanding it
But rather less than more.  We'd run the thing through PRL.

He sent them word that we would try Don't tell a soul about all this
To pass where they had failed   For it must ever be
And after we were done, to them A secret, kept from all the rest
The new proof would be mailed.  Between yourself and me.

My notion was to start again
Ignoring all they'd done
We quickly turned it into code
To see if it would run.




Re: Is vim really fully unicoded?

2009-01-06 Thread Yue Wu

On Wed, 07 Jan 2009 08:25:35 +0800, Tony Mechelynck wrote:


> Changing an existing construct from byte-oriented to
> multibyte-character-oriented would probably break a lot of existing
> scripts. I don't believe Bram would ever accept that.
>
> Best regards,
> Tony.

Hmm, I think I got the point.

By the way, I tested your snippet on a line containing 测试(test):

    let i0 = byteidx(getline('.'), virtcol('.') - 1)
    let i1 = byteidx(getline('.'), virtcol('.'))
    let character = strpart(getline('.'), i0, i1 - 10)

Afterwards, "echo character" printed nothing.

-- 
Regards,
Van.




Re: Is vim really fully unicoded?

2009-01-06 Thread Tony Mechelynck

On 07/01/09 02:10, Yue Wu wrote:
> Hmm, I think I got the point.
>
> btw, I tested your func on a line with 测试(test)
>
>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>     let i1 = byteidx(getline('.'), virtcol('.'))
>     let character = strpart(getline('.'), i0, i1 - 10)
>
> Then echo character got nothing.


Try the function in my next post. If you don't want to clobber the 
unnamed register, here is a variant:

function CursorChar()
let unnamed = @@
normal yl
let retval = @@
let @@ = unnamed
return retval
endfunction
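
A more defensive take on the variant above can be sketched as follows. It is only a sketch: the name CursorCharSafe is made up for illustration, and it assumes that restoring the unnamed register is enough for the caller (register 0, which the yank also overwrites, is left alone).

```vim
" Hypothetical helper, not part of Vim: return the character under the cursor.
function! CursorCharSafe()
  " Remember both the text and the type (characterwise, linewise,
  " blockwise) of the unnamed register.
  let l:saved_text = getreg('"')
  let l:saved_type = getregtype('"')
  " The ! makes :normal ignore any user mappings of y or l.
  normal! yl
  let l:char = getreg('"')
  " Put the unnamed register back as it was.
  call setreg('"', l:saved_text, l:saved_type)
  return l:char
endfunction
```

With the cursor on a multibyte character, :echo CursorCharSafe() should print that whole character, because yl yanks by character rather than by byte.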


Best regards,
Tony.
-- 
If you had any brains, you'd be dangerous.



--~--~-~--~~~---~--~~
You received this message from the vim_dev maillist.
For more information, visit http://www.vim.org/maillist.php
-~--~~~~--~~--~--~---



Re: Is vim really fully unicoded?

2009-01-06 Fir de Conversatie Yue Wu

On Wed, 07 Jan 2009 10:24:30 +0800, Tony Mechelynck wrote:


 On 07/01/09 02:10, Yue Wu wrote:
 On Wed, 07 Jan 2009 08:25:35 +0800, Tony Mechelynck wrote:

 On 07/01/09 00:39, Matt Wozniski wrote:
 On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
 On 06/01/09 12:31, anhnmncb wrote:
 Hi list, as the title asks: if so, why do many functions still
 not handle Unicode correctly? For example, the expression:

 getline('.')[col('.')-1]

 can't return a character outside the ASCII range.

 because string[index] returns a byte value, not a character value: see
 :help expr8.
 *Nod*

 If the character at the cursor is > U+007F, you'll get
 the first byte (in the range 0xC0-0xFD, or in practice in the range
 0xC0-0xF4) of its UTF-8 representation.
 No, you could get some byte of some entirely different character.  Ie,
 on a line with two 2-byte characters, getline('.')[col('.')-1] on the
 second character would return the 2nd byte of the first character.
 col() gives a one-based byte ordinal. [] takes a zero-based argument. I
 stand by what I said.

 The _character_ at the cursor is obtained as follows:
  let i0 = byteidx(getline('.'), virtcol('.') - 1)
  let i1 = byteidx(getline('.'), virtcol('.'))
  let character = strpart(getline('.'), i0, i1 - i0)
 Using virtcol() there seems broken... what if you're in the middle of
 a tab, for example, with virtualedit=all?

 :echo join(split('áéíóú', '\zs')[1:3], '')
 OK, I didn't think of virtual editing, nor even, it seems, of
 multi-column characters such as tabs and fullwidth CJK. However, [1:3]
 wouldn't work because the idea is that we're in a script, we don't know
 that we're in the 1st, 2nd or 3rd column, just that we want whatever is
 at the cursor. I might do it with

 function CursorChar()
 normal yl
 return @@
 endfunction

 is how I would do it... but, is there any real reason why indexing
 into a string *should* be byte oriented instead of character oriented,
 apart from backwards compatibility?  It seems drastically less easy to
 use the thing that more people want to use more of the time; and in
 fact some of the snippets in the vim help (like the example given at
 :help expr-8) won't work on multibyte lines given the way that string
 indexing works now.  It seems like a place where the cost of losing
 backwards compatibility might be outweighed by the cost of keeping
 things the way they are...

 ~Matt
 Changing an existing construct from byte-oriented to
 multibyte-character-oriented would probably break a lot of existing
 scripts. I don't believe Bram would ever accept that.

 Best regards,
 Tony.

 Hmm, I think I got the point.

 btw, I tested your func on a line with 测试(test)

  let i0 = byteidx(getline('.'), virtcol('.') - 1)
  let i1 = byteidx(getline('.'), virtcol('.'))
  let character = strpart(getline('.'), i0, i1 - i0)

 Then :echo character printed nothing.


 Try the function in my next post. If you don't want to clobber the
 unnamed register, here is a variant:

   function CursorChar()
   let unnamed = @@
   normal yl
   let retval = @@
   let @@ = unnamed
   return retval
   endfunction

Yes, it works, but I don't like a function that contains normal
operators. I always think that a normal operator is only used for
normal mode by keyboard; when writing a function, it's better to use
the function corresponding to the operator.

This version works fine:

matchstr(getline('.'), '\%' . col('.') . 'c.')

whereas this one doesn't:

matchstr(getline('.'), '\%' . virtcol('.') . 'c.')
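
One way to see why the col() form works, sketched on a literal string rather than the current line: the \%Nc atom counts bytes, which is also what col() returns. The sketch assumes 'encoding' is utf-8, where each accented letter below takes two bytes.

```vim
let text = 'áéíóú'
" 'á' fills bytes 1 and 2, so byte column 3 is where 'é' starts.
" The . atom then matches the whole two-byte character.
echo matchstr(text, '\%3c.')
```

With the string above this should print é. virtcol() counts screen columns instead, so its result only happens to agree with the byte column on lines made of single-byte, single-width characters.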



 Best regards,
 Tony.



-- 
Regards,
Van.




Re: Is vim really fully unicoded?

2009-01-06 Fir de Conversatie Tony Mechelynck

On 07/01/09 03:38, Yue Wu wrote:
[...]
 I always think that a normal operator is only used for
 normal mode by keyboard,[...]

Oh? I have the opposite impression. For normal mode by keyboard, I don't use

:normal yl<Enter>

but

yl

To me, the :normal command is _only_ useful in scripts, in order to 
run in Ex mode the key sequences meant for Normal mode.


Best regards,
Tony.
-- 
If bankers can count, how come they have eight windows and only four
tellers?




Re: Is vim really fully unicoded?

2009-01-06 Fir de Conversatie Yue Wu

On Wed, 07 Jan 2009 10:55:33 +0800, Tony Mechelynck wrote:


 On 07/01/09 03:38, Yue Wu wrote:
 [...]
 I always think that a normal operator is only used for
 normal mode by keyboard,[...]

 Oh? I have the opposite impression. For normal mode by keyboard, I don't
 use

   :normal yl<Enter>

 but

   yl

 To me, the :normal command is _only_ useful in scripts, in order to
 run in Ex mode the key sequences meant for Normal mode.

I mean I would avoid using yl through :normal if there is an equivalent such as :yank :)

-- 
Regards,
Van.




Re: Is vim really fully unicoded?

2009-01-06 Fir de Conversatie Tony Mechelynck

On 07/01/09 04:17, Yue Wu wrote:
 On Wed, 07 Jan 2009 10:55:33 +0800, Tony Mechelynck wrote:

 On 07/01/09 03:38, Yue Wu wrote:
 [...]
 I always think that a normal operator is only used for
 normal mode by keyboard,[...]
 Oh? I have the opposite impression. For normal mode by keyboard, I don't
 use

  :normal yl<Enter>

 but

  yl

 To me, the :normal command is _only_ useful in scripts, in order to
 run in Ex mode the key sequences meant for Normal mode.

 I mean I would avoid using yl through :normal if there is an equivalent such as :yank :)


There is a :yank command but it acts linewise. Here we want a 
characterwise yank, so we cannot use :yank.

The function you proposed is so complex that I would run a much greater
risk of getting it wrong than with :normal yl.

If the complexity is similar, I use the ex-command in scripts, for 
instance :wincmd k rather than :normal ^Wk where ^W would be 
obtained by hitting Ctrl-V followed by Ctrl-W.
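
As a small sketch of the two forms side by side (the :execute wrapper is my addition here, used so that no literal control character has to be typed into the script):

```vim
" Ex command form: move to the window above, if there is one.
wincmd k

" :normal form of the same motion. Inside a double-quoted string,
" "\<C-W>" expands to the Ctrl-W character.
execute "normal! \<C-W>k"
```

Both lines should leave the cursor in the window above the current one and do nothing when there is no such window.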

Best regards,
Tony.
-- 
ARTHUR:  Shut up!  Will you shut up!
DENNIS:  Ah, now we see the violence inherent in the system.
ARTHUR:  Shut up!
DENNIS:  Oh!  Come and see the violence inherent in the system!
  HELP! HELP!  I'm being repressed!
   The Quest for the Holy Grail (Monty Python)




Re: Is vim really fully unicoded?

2009-01-06 Fir de Conversatie Matt Wozniski

On 1/6/09, Tony Mechelynck wrote:
 On 1/6/09, Matt Wozniski wrote:
 echo matchstr(getline('.'), '\%' . col('.') . 'c.')

 Again, col('.') is a byte index, not a column. What about virtcol('.')
  instead?

Nope.  \%15c is also a byte index, not a column (which is also
counter-intuitive, and brings us back to the problem - that however
well documented it is, even experienced vimscript programmers get this
stuff wrong regularly.)

 Changing an existing construct from byte-oriented to
   multibyte-character-oriented would probably break a lot of existing
   scripts. I don't believe Bram would ever accept that.

 But sometimes, breaking things is required to make progress.  The fact
 that we're having a conversation with both of us suggesting (fairly
 complicated) things that haven't worked is a perfect proof for the
 fact that the current system is counterintuitive and hard to use...

 That's no reason for breaking what does work. I don't mind
  counterintuitive as long as it's documented.

See above.  If no one can remember how to use it, or the workarounds
needed to make it work cost the author more trouble than simply not
having it work on multibyte input, I'd say that it _doesn't_
work as is.

In fact, I'd argue that having string indexing be byte-oriented after
multibyte was added was a regression that broke things that did work:
before, getline('.')[col('.')-1] was a valid way to get the character
under the cursor, and afterwards it was not.  Changing this behavior
would probably break very few scripts, since I doubt most scripters
are defensive about doing it correctly, and would mean that all the
broken code that already exists, and even the code that was written
before proper multibyte support was added (I believe it was added
after vimscript, right?), would continue to work *unless* it was
written intentionally to work around this issue.  And I think that
authors who knew enough to work around this would, by and large, be
happy to see it fixed.  I think that the advantages of having new
scripts work the way that they should, instead of the way that they
do, would greatly outweigh the disadvantages of breaking scripts
depending on the broken behavior.  But, Bram's opinion is the final
answer, so we'll see if he weighs in.
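
To make the byte-versus-character distinction concrete, here is a minimal sketch. It assumes 'encoding' is utf-8, and CharAt() is a hypothetical helper written for this illustration, not a Vim built-in.

```vim
" Assumes 'encoding' is utf-8, so 'é' occupies two bytes.
let text = 'éa'

" strlen() counts bytes: this prints 3, not 2.
echo strlen(text)
" text[0] is only the first byte of 'é', so this prints 0.
echo text[0] ==# 'é'

" Hypothetical helper: return the character at zero-based character
" index idx, or '' when idx is out of range.
function! CharAt(str, idx)
  " byteidx() turns a character index into a byte index (-1 if too large).
  let l:start = byteidx(a:str, a:idx)
  let l:end = byteidx(a:str, a:idx + 1)
  if l:start < 0 || l:end < 0
    return ''
  endif
  " strpart() takes a byte offset and a byte length.
  return strpart(a:str, l:start, l:end - l:start)
endfunction

" Prints the second character of the test line used earlier in the thread.
echo CharAt('测试(test)', 1)
```

The last line should print 试: byteidx() maps character index 1 to byte 3 and index 2 to byte 6, and strpart() then copies those three bytes.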

~Matt
