On Mar 30, 2014 3:35 AM, "Dmitry Frank" <[email protected]> wrote:
>
> Hello all.
>
> The match() function returns the index of the first match, but if there are
> multi-byte chars before the first match, each multi-byte char is
> counted as several chars, so the index becomes wrong.
>
> Say, match("foobar", "bar") returns 3, which is correct. But
> match("яfoobar", "bar") returns 5, which is wrong (should be 4).

The result is completely correct. What would you do with 4? "яfoobar"[4]
is "o" (specifically, the second one), because string indexing counts bytes
as well.
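
To make the byte arithmetic concrete, here is a small sketch. It is Python,
not Vim script: a Python bytes object stands in for a Vim string under
&encoding=utf-8, which is an assumption made purely for illustration.

```python
# Illustration only: UTF-8 bytes stand in for a Vim string (assumption).
s = "яfoobar".encode("utf-8")

print(len(s))          # 8: 'я' occupies two bytes, the other six take one each
print(s[4:5])          # b'o': byte index 4 is the second "o"
print(s.find(b"bar"))  # 5: the byte offset where "bar" starts
print(s[5:])           # b'bar': the byte offset is directly usable for slicing
```

So 5 is the value that works with byte-based indexing, while 4 lands one
byte too early.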

>
> Notice: in the latter example above, I've inserted the Russian letter 'я',
> which is multi-byte in UTF-8.
>
> It happens when &encoding is "utf-8". I've also tested it on Windows: with a
> Russian locale, &encoding is "cp1251", and match() then works correctly
> with Russian chars. So, it depends on &encoding.

match() returns a *byte offset*. With a single-byte encoding such as cp1251,
every character occupies exactly one byte, so the byte offset and the
character index happen to coincide.
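
If a character index is really what you need, it can be derived from the
byte offset by counting the characters that precede it. A minimal sketch in
Python follows; byte_to_char_index is a hypothetical helper written for this
example, not a Vim function.

```python
def byte_to_char_index(text: str, byte_offset: int, encoding: str = "utf-8") -> int:
    """Count the characters that precede byte_offset in the encoded text.

    Hypothetical helper for illustration; byte_offset must fall on a
    character boundary, which holds for the start of a match.
    """
    prefix = text.encode(encoding)[:byte_offset]
    return len(prefix.decode(encoding))


text = "яfoobar"

# Byte offset of "bar" under each encoding mentioned in this thread.
utf8_off = text.encode("utf-8").find(b"bar")      # 'я' is two bytes here
cp1251_off = text.encode("cp1251").find(b"bar")   # 'я' is one byte here

print(utf8_off, byte_to_char_index(text, utf8_off))                # 5 4
print(cp1251_off, byte_to_char_index(text, cp1251_off, "cp1251"))  # 4 4
```

In Vim script the same idea should be expressible as
strchars(strpart(str, 0, match(str, pat))), since strpart() works on bytes
and strchars() counts characters.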

>
> But we surely need to make match() work as expected when &encoding is
> "utf-8" too.

And col(), string indexing, /\%Nc and so on as well? Not going to happen:
this would be an incompatible change.

>
> --
> Regards,
> Dmitry
>
> --
> --
> You received this message from the "vim_dev" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
> ---
> You received this message because you are subscribed to the Google Groups
> "vim_dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
