On Wed, 07 Jan 2009 10:24:30 +0800, Tony Mechelynck wrote:

>
> On 07/01/09 02:10, Yue Wu wrote:
>> On Wed, 07 Jan 2009 08:25:35 +0800, Tony Mechelynck wrote:
>>
>>> On 07/01/09 00:39, Matt Wozniski wrote:
>>>> On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
>>>>> On 06/01/09 12:31, anhnmncb wrote:
>>>>>> Hi, list, as title, if so, why can't many functions
>>>>>> still handle correctly with unicode? For example the func:
>>>>>>
>>>>>>     getline('.')[col('.')-1]
>>>>>>
>>>>>> Can't return a character outside the range of ascii.
>>>>>>
>>>>> because string[index] returns a byte value, not a character value:
>>>>> see ":help expr8".
>>>> *Nod*
>>>>
>>>>> If the character at the cursor is > U+007F, you'll get
>>>>> the first byte (in the range 0xC0-0xFD, or in practice in the range
>>>>> 0xC0-0xF4) of its UTF-8 representation.
>>>> No, you could get some byte of some entirely different character. I.e.,
>>>> on a line with two 2-byte characters, getline('.')[col('.')-1] on the
>>>> second character would return the 2nd byte of the first character.
>>> col() gives a one-based byte ordinal. [] takes a zero-based argument. I
>>> stand by what I said.
>>>
>>>>> The _character_ at the cursor is obtained as follows:
>>>>>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>>>>     let i1 = byteidx(getline('.'), virtcol('.'))
>>>>>     let character = strpart(getline('.'), i0, i1 - i0)
>>>> Using virtcol() there seems broken... what if you're in the middle of
>>>> a tab, for example, with virtualedit=all?
>>>>
>>>>     :echo join(split("áéíóú", '\zs')[1:3], '')
>>> OK, I didn't think of virtual editing, nor even, it seems, of
>>> multi-column characters such as tabs and fullwidth CJK. However, [1:3]
>>> wouldn't work because the idea is that we're in a script, we don't know
>>> that we're in the 1st, 2nd or 3rd column, just that we want "whatever
>>> is at the cursor".
>>> I might do it with
>>>
>>> function CursorChar()
>>>     normal yl
>>>     return @@
>>> endfunction
>>>
>>>> is how I would do it... but, is there any real reason why indexing
>>>> into a string *should* be byte oriented instead of character oriented,
>>>> apart from backwards compatibility? It seems drastically less easy to
>>>> use the thing that more people want to use more of the time; and in
>>>> fact some of the snippets in the vim help (like the example given at
>>>> :help expr-8) won't work on multibyte lines given the way that string
>>>> indexing works now. It seems like a place where the cost of losing
>>>> backwards compatibility might be outweighed by the cost of keeping
>>>> things the way they are...
>>>>
>>>> ~Matt
>>> Changing an existing construct from byte-oriented to
>>> multibyte-character-oriented would probably break a lot of existing
>>> scripts. I don't believe Bram would ever accept that.
>>>
>>> Best regards,
>>> Tony.
>>
>> Hmm, I think I got the point.
>>
>> btw, I tested your func on a line with "测试"(test)
>>
>>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>     let i1 = byteidx(getline('.'), virtcol('.'))
>>     let character = strpart(getline('.'), i0, i1 - i0)
>>
>> Then echo character got nothing.
>>
>
> Try the function in my next post. If you don't want to clobber the
> unnamed register, here is a variant:
>
> function CursorChar()
>     let unnamed = @@
>     normal yl
>     let retval = @@
>     let @@ = unnamed
>     return retval
> endfunction

Yes, it works, but I don't like functions that contain Normal-mode
commands. To my mind, Normal-mode commands are meant to be typed at the
keyboard; when writing a function, it is better to use the built-in
function that corresponds to the command.

This version works fine:

    matchstr(getline('.'), '\%' . col('.') . 'c.')

whereas this one doesn't:

    matchstr(getline('.'), '\%' . virtcol('.') . 'c.')

>
>
> Best regards,
> Tony.

-- 
Regards,
Van.

You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
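
For anyone who wants to reuse the matchstr() expression from the reply above, here is a minimal sketch that wraps it in a function. The name CharUnderCursor is only illustrative and does not appear in the thread. It relies on col('.') being a byte index and on \%Nc matching at byte column N, which is presumably why the virtcol() variant, which counts screen columns, fails on multibyte or double-width text.

```vim
" Hypothetical helper (the name is an assumption, not from the thread):
" return the character under the cursor without :normal and without
" touching any register.
function! CharUnderCursor()
  " col('.') is the 1-based byte index of the cursor in the current line.
  " \%Nc anchors the match at byte column N; the following '.' then
  " matches one whole character, even if it is several bytes long.
  return matchstr(getline('.'), '\%' . col('.') . 'c.')
endfunction
```

With the cursor on a multibyte character, `:echo CharUnderCursor()` should show that character rather than a single byte of it. On an empty line the pattern has nothing to match, so the function returns an empty string.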