On Wed, Mar 3, 2010 at 5:18 PM, Stephan Hennig <[email protected]> wrote:
>
> My original problem has already been solved by the function posted in my
> second mail.[1]  Here's a slightly modified version:
>
> function utf8_find(str, pattern, init)
>   local s = unicode.utf8.sub(str, init)
>   -- search for first occurrence of pattern
>   s = unicode.utf8.match(s, "^.-" .. pattern)
>   -- calculate end point of match
>   local e = s and init + unicode.utf8.len(s) - 1
>   -- calculate beginning of match
>   local b = e and e - unicode.utf8.len(pattern) + 1
>   -- return indices of found match, or nil
>   return b, e
> end

I like this one because it's not mixed with string.*: we all know that
unicode.utf8.match operates in "octet point of view" mode and
unicode.utf8.len in "Unicode point of view" mode.

--
luigi
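
P.S. A minimal sketch of the two points of view, using only the standard string library so it runs without unicode.utf8. The helper utf8_len is hypothetical (it is not part of slnunicode) and assumes the input is valid UTF-8:

```lua
-- Hypothetical helper: count UTF-8 code points by counting every byte
-- that is not a continuation byte (0x80-0xBF). Assumes valid UTF-8.
local function utf8_len(str)
  local _, count = string.gsub(str, "[^\128-\191]", "")
  return count
end

-- "naive" with a diaeresis on the i, written as the two bytes C3 AF
local s = "na\195\175ve"

print(#s)           -- octet point of view: 6
print(utf8_len(s))  -- Unicode point of view: 5

-- string.find reports byte offsets, so "v" is found at byte 5 ...
local byte_pos = string.find(s, "v", 1, true)
-- ... which is code point 4 once the bytes before it are counted
local char_pos = utf8_len(string.sub(s, 1, byte_pos - 1)) + 1

print(byte_pos, char_pos)  -- 5   4
```

Mixing an index from one view with a length from the other is what throws the offsets off, which is why staying inside one library for the whole calculation is attractive.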
