On Sun, 17 Feb 2013, ND wrote:

> One more thing. I am not sure that this algorithm (keeping the start at
> ovector[0]-max_lookbehind) will work correctly in the Unicode case. I
> don't fully understand what will happen. In the documentation I read
> that max_lookbehind returns a number of symbols (not bytes).

Correct. It returns a character (not byte) count, because this is what
is needed when doing a lookbehind. Just subtracting it from the starting
point is no good; you have to "walk" back along the string for that many
characters in the Unicode case.

Philip

--
Philip Hazel

--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
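
The "walk back" step described above could be sketched as follows for the UTF-8 case. This is only an illustration, not code from PCRE: utf8_walk_back is a hypothetical helper name, and the sketch assumes the subject is valid UTF-8 and that the starting offset lies on a character boundary. It relies on the fact that UTF-8 continuation bytes always have the bit pattern 10xxxxxx.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the PCRE API): step back up to `nchars`
   UTF-8 characters from byte offset `offset` in `subject`, stopping early
   at the start of the string. Assumes valid UTF-8 and that `offset` is on
   a character boundary. Returns the new byte offset. */
static size_t utf8_walk_back(const unsigned char *subject, size_t offset,
                             size_t nchars)
{
    while (nchars > 0 && offset > 0) {
        offset--;                       /* step onto the previous byte */
        /* Skip continuation bytes (10xxxxxx) to reach the lead byte. */
        while (offset > 0 && (subject[offset] & 0xC0) == 0x80)
            offset--;
        nchars--;
    }
    return offset;
}
```

With such a helper, the retained start would be computed as utf8_walk_back(subject, ovector[0], max_lookbehind) rather than ovector[0] - max_lookbehind. In 8-bit non-UTF mode every character is one byte, so plain subtraction (clamped at zero) gives the same result.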
