Re: Is vim really fully unicoded?

Yue Wu Tue, 06 Jan 2009 18:38:58 -0800

On Wed, 07 Jan 2009 10:24:30 +0800, Tony Mechelynck wrote:

>
> On 07/01/09 02:10, Yue Wu wrote:
>> On Wed, 07 Jan 2009 08:25:35 +0800, Tony Mechelynck wrote:
>>
>>> On 07/01/09 00:39, Matt Wozniski wrote:
>>>> On Tue, Jan 6, 2009 at 6:10 PM, Tony Mechelynck wrote:
>>>>> On 06/01/09 12:31, anhnmncb wrote:
>>>>>> Hi, list, as the title asks: if so, why do many functions
>>>>>> still not handle Unicode correctly? For example, the expression:
>>>>>>
>>>>>>         getline('.')[col('.')-1]
>>>>>>
>>>>>> can't return a character outside the ASCII range.
>>>>>>
>>>>> because string[index] returns a byte value, not a character value:
>>>>> see ":help expr8".
>>>> *Nod*
>>>>
>>>>>    If the character at the cursor is > U+007F, you'll get
>>>>> the first byte (in the range 0xC0-0xFD, or in practice in the range
>>>>> 0xC0-0xF4) of its UTF-8 representation.
>>>> No, you could get some byte of some entirely different character.  I.e.,
>>>> on a line with two 2-byte characters, getline('.')[col('.')-1] on the
>>>> second character would return the 2nd byte of the first character.
>>> col() gives a one-based byte ordinal. [] takes a zero-based argument. I
>>> stand by what I said.
>>>
>>>>> The _character_ at the cursor is obtained as follows:
>>>>>          let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>>>>          let i1 = byteidx(getline('.'), virtcol('.'))
>>>>>          let character = strpart(getline('.'), i0, i1 - i0)
>>>> Using virtcol() there seems broken... what if you're in the middle of
>>>> a tab, for example, with virtualedit=all?
>>>>
>>>> :echo join(split("áéíóú", '\zs')[1:3], '')
>>> OK, I didn't think of virtual editing, nor even, it seems, of
>>> multi-column characters such as tabs and fullwidth CJK. However, [1:3]
>>> wouldn't work because the idea is that we're in a script, we don't know
>>> that we're in the 1st, 2nd or 3rd column, just that we want "whatever is
>>> at the cursor". I might do it with
>>>
>>>     function CursorChar()
>>>             normal yl
>>>             return @@
>>>     endfunction
>>>
>>>> is how I would do it... but, is there any real reason why indexing
>>>> into a string *should* be byte oriented instead of character oriented,
>>>> apart from backwards compatibility?  It seems drastically less easy to
>>>> use the thing that more people want to use more of the time; and in
>>>> fact some of the snippets in the vim help (like the example given at
>>>> :help expr-8) won't work on multibyte lines given the way that string
>>>> indexing works now.  It seems like a place where the cost of losing
>>>> backwards compatibility might be outweighed by the cost of keeping
>>>> things the way they are...
>>>>
>>>> ~Matt
>>> Changing an existing construct from byte-oriented to
>>> multibyte-character-oriented would probably break a lot of existing
>>> scripts. I don't believe Bram would ever accept that.
>>>
>>> Best regards,
>>> Tony.
>>
>> Hmm, I think I got the point.
>>
>> btw, I tested your func on a line with "测试"(test)
>>
>>      let i0 = byteidx(getline('.'), virtcol('.') - 1)
>>      let i1 = byteidx(getline('.'), virtcol('.'))
>>      let character = strpart(getline('.'), i0, i1 - i0)
>>
>> Then "echo character" printed nothing.
>>
>
> Try the function in my next post. If you don't want to clobber the
> unnamed register, here is a variant:
>
>       function CursorChar()
>               let unnamed = @@
>               normal yl
>               let retval = @@
>               let @@ = unnamed
>               return retval
>       endfunction


Yes, it works, but I don't like functions that contain Normal-mode
commands. I always think of those as meant for interactive use from the
keyboard; when writing a function, it's better to use the built-in
function corresponding to the command.

This version works fine:

        matchstr(getline('.'), '\%' . col('.') . 'c.')

whereas this one doesn't:

        matchstr(getline('.'), '\%' . virtcol('.') . 'c.')
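
The difference is probably that \%Nc counts bytes, which is what col()
returns, while virtcol() returns a screen column; the two only agree for
single-byte, single-cell text. As a minimal sketch (the helper name
CharAtByteCol is made up for illustration, and 'encoding' is assumed to
be utf-8), the working expression could be wrapped like this:

```vim
" Hypothetical helper, not from this thread: return the whole character
" that starts at 1-based byte column a:bytecol of a:text, or '' if the
" pattern does not match there.
function! CharAtByteCol(text, bytecol)
  " \%Nc anchors the match at byte column N (the unit col() reports);
  " '.' then consumes one full character, however many bytes it uses.
  return matchstr(a:text, '\%' . a:bytecol . 'c.')
endfunction

" Character under the cursor, without touching any register.
function! CursorChar()
  return CharAtByteCol(getline('.'), col('.'))
endfunction
```

On the "测试" line, each character takes three bytes in UTF-8, so
CharAtByteCol(getline('.'), 1) should return the first character and
CharAtByteCol(getline('.'), 4) the second.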

>
>
> Best regards,
> Tony.



-- 
Regards,
Van.

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
