index(split("こんにちわ世界", '\zs'), "世") should return 5
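
For the match() case asked about below, one possible approach is to count the characters that come before the byte offset. This is only a rough sketch: MatchCharIndex is a hypothetical helper name (not part of Vim or of PreciseJump), and it assumes &encoding is "utf-8". It counts with split(..., '\zs') so that the result lines up with an array built the same way.

```vim
" Hypothetical helper: character index of the first match of a:pat in a:str,
" or -1 if there is no match.
function! MatchCharIndex(str, pat) abort
  " match() returns a byte offset, or -1 when nothing matches.
  let l:byte_idx = match(a:str, a:pat)
  if l:byte_idx < 0
    return -1
  endif
  " Take the bytes before the match and count them character by character,
  " the same way split(getline(l), '\zs') would.
  return len(split(strpart(a:str, 0, l:byte_idx), '\zs'))
endfunction

" With &encoding set to "utf-8":
echo MatchCharIndex("яfoobar", "bar")    " 4 (match() itself returns 5)
echo MatchCharIndex("foobar", "bar")     " 3
echo MatchCharIndex("foobar", "xyz")     " -1
```

The pattern is passed in single quotes in split() because a double-quoted "\zs" loses its backslash before the regexp engine sees it.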

On 3/30/14, Dmitry Frank <[email protected]> wrote:
> Then, how can I get the symbol index (not byte offset) of a match?
>
> There is an awesome plugin, "PreciseJump":
> http://www.vim.org/scripts/script.php?script_id=3437 . It builds an array of
> all symbols of the line, like this:
>
>         let lines_with_markers[l] = split(getline(l), '\zs')
>
> So a symbol index is needed here, not a byte offset. Because of this, it
> fails if there are multi-byte chars before the match.
>
> Currently, it's calculated like this:
>
>         let match_start = match(getline(l), a:re, 0, 1)
>
> Please suggest how to get symbol index instead.
>
>
>
> 2014-03-30 4:03 GMT+04:00 Nikolay Pavlov <[email protected]>:
>
>>
>> On Mar 30, 2014 3:35 AM, "Dmitry Frank" <[email protected]> wrote:
>> >
>> > Hello all.
>> >
>> > The match() function returns the index of the first match, but if there are
>> multi-byte chars before the first match, each multi-byte char is
>> counted as several chars, so the index comes out wrong.
>> >
>> > Say, match("foobar", "bar") returns 3, which is correct.  But
>> match("яfoobar", "bar")  returns 5, which is wrong (should be 4)
>>
>> This is completely correct. What are you going to do with 4? "яfoobar"[4]
>> is "o" (specifically, second one).
>>
>> >
>> > Notice: in the latter example above, I've inserted the Russian letter 'я',
>> which is multi-byte in utf-8.
>> >
>> > It happens when &encoding is "utf-8". I've also tested it on Windows: with a
>> Russian locale &encoding is "cp1251", and then match() works correctly
>> with Russian chars. So it depends on &encoding.
>>
>> match() returns *byte offset*. Obviously with a single-byte encoding one
>> character always occupies one byte.
>>
>> >
>> > But we surely need to make match() work as expected when &encoding is
>> "utf-8" too.
>>
>> Also col(), string indexing, /\%Nc and so on? Not going to happen, this is
>> an incompatible change.
>>
>> >
>> > --
>> > Regards,
>> > Dmitry
>> >
>> > --
>> > --
>> > You received this message from the "vim_dev" maillist.
>> > Do not top-post! Type your reply below the text you are replying to.
>> > For more information, visit http://www.vim.org/maillist.php
>> >
>> > ---
>> > You received this message because you are subscribed to the Google
>> Groups "vim_dev" group.
>> > To unsubscribe from this group and stop receiving emails from it, send
>> an email to [email protected].
>> > For more options, visit https://groups.google.com/d/optout.


-- 
- Yasuhiro Matsumoto
