On 07/01/09 02:10, Yue Wu wrote:
> On Wed, 07 Jan 2009 08:25:35 +0800, Tony Mechelynck wrote:
>
>> On 07/01/09 00:39, Matt Wozniski wrote:
>>> On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
>>>> On 06/01/09 12:31, anhnmncb wrote:
>>>>> Hi, list, as title, if so, why can't many functions
>>>>> still handle correctly with unicode? For example the func:
>>>>>
>>>>> getline('.')[col('.')-1]
>>>>>
>>>>> Can't return a character outside the range of ascii.
>>>>>
>>>> because string[index] returns a byte value, not a character value: see
>>>> ":help expr8".
>>> *Nod*
>>>
>>>> If the character at the cursor is > U+007F, you'll get
>>>> the first byte (in the range 0xC0-0xFD, or in practice in the range
>>>> 0xC0-0xF4) of its UTF-8 representation.
>>> No, you could get some byte of some entirely different character. Ie,
>>> on a line with two 2-byte characters, getline('.')[col('.')-1] on the
>>> second character would return the 2nd byte of the first character.
>> col() gives a one-based byte ordinal. [] takes a zero-based argument. I
>> stand by what I said.
>>
>>>> The _character_ at the cursor is obtained as follows:
>>>>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>>>     let i1 = byteidx(getline('.'), virtcol('.'))
>>>>     let character = strpart(getline('.'), i0, i1 - i0)
>>> Using virtcol() there seems broken... what if you're in the middle of
>>> a tab, for example, with virtualedit=all?
>>>
>>> :echo join(split("áéíóú", '\zs')[1:3], '')
>> OK, I didn't think of virtual editing, nor even, it seems, of
>> multi-column characters such as tabs and fullwidth CJK. However, [1:3]
>> wouldn't work because the idea is that we're in a script, we don't know
>> that we're in the 1st, 2nd or 3rd column, just that we want "whatever is
>> at the cursor". I might do it with
>>
>>     function CursorChar()
>>         normal yl
>>         return @@
>>     endfunction
>>
>>> is how I would do it... but, is there any real reason why indexing
>>> into a string *should* be byte oriented instead of character oriented,
>>> apart from backwards compatibility? It seems drastically less easy to
>>> use the thing that more people want to use more of the time; and in
>>> fact some of the snippets in the vim help (like the example given at
>>> :help expr-8) won't work on multibyte lines given the way that string
>>> indexing works now. It seems like a place where the cost of losing
>>> backwards compatibility might be outweighed by the cost of keeping
>>> things the way they are...
>>>
>>> ~Matt
>> Changing an existing construct from byte-oriented to
>> multibyte-character-oriented would probably break a lot of existing
>> scripts. I don't believe Bram would ever accept that.
>>
>> Best regards,
>> Tony.
>
> Hmm, I think I got the point.
>
> btw, I tested your func on a line with "测试"(test)
>
>     let i0 = byteidx(getline('.'), virtcol('.') - 1)
>     let i1 = byteidx(getline('.'), virtcol('.'))
>     let character = strpart(getline('.'), i0, i1 - i0)
>
> Then echo character got nothing.
>
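
A likely reason the quoted snippet misbehaves on "测试": virtcol() counts
screen columns, while byteidx() expects a character index, and the two no
longer line up once double-width characters are involved. As a rough sketch
only (CharUnderCursor is a made-up name for illustration, not a built-in),
one way to get the character at the cursor from the byte column, without
touching virtcol() or any register, might be:

```vim
" Sketch only: CharUnderCursor is a hypothetical helper name.
function! CharUnderCursor()
  " col('.') is the one-based byte column of the cursor.
  " \%Nc anchors the match at byte column N; the '.' that follows
  " matches one whole character, however many bytes it occupies.
  return matchstr(getline('.'), '\%' . col('.') . 'c.')
endfunction
```

With the cursor on a character of "测试", :echo CharUnderCursor() should
show that character rather than a single byte of it. On an empty line it
returns an empty string.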

Try the function in my next post. If you don't want to clobber the
unnamed register, here is a variant:

    function CursorChar()
        " save the unnamed register before the yank overwrites it
        let unnamed = @@
        normal yl
        let retval = @@
        " put the saved contents back
        let @@ = unnamed
        return retval
    endfunction

Best regards,
Tony.
--
If you had any brains, you'd be dangerous.

You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php