On 03.03.2010 10:19, Manuel Pégourié-Gonnard wrote:
luigi scarso wrote:

Can we implement an acceptable wrapper?

Yes, a proper wrapper has already been given by Patrick [1] and quoted by
myself. Here it is again, now in the form of a function:

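function find_utf8_chars(str, pat)
    local a, b = unicode.utf8.find(str, pat)
    -- no match: return nil instead of converting nil offsets
    if not a then return nil end
    -- convert the offsets returned by find into character counts
    a = unicode.utf8.len(string.sub(str, 1, a))
    b = unicode.utf8.len(string.sub(str, 1, b))
    return a, b
end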

  > [...]

[1] http://tug.org/pipermail/luatex/2010-March/001262.html

My original problem has already been solved by the function posted in my second mail.[1] Here's a slightly modified version:

function utf8_find(str, pattern, init)
    -- default to searching from the first character
    init = init or 1
    local s = unicode.utf8.sub(str, init)
    -- search for first occurrence of pattern
    s = unicode.utf8.match(s, "^.-" .. pattern)
    -- calculate end point of match
    local e = s and init + unicode.utf8.len(s) - 1
    -- calculate beginning of match; this assumes a literal pattern,
    -- whose match is exactly as long as the pattern itself
    local b = e and e - unicode.utf8.len(pattern) + 1
    -- return indices of found match, or nil
    return b, e
end

It works similarly, but uses match instead of find. Although Patrick's approach could be a bit faster than mine, neither will perform well, since they

   * build temporary strings and

   * have to iterate over strings several times (find/match, sub, len).

A native C implementation would probably be significantly faster than a Lua implementation. The slnunicode developers decided not to provide such a thing. I can't imagine why.
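Short of a C implementation, the temporary strings at least could be avoided in plain Lua. The following is only a minimal sketch under some assumptions: the input is valid UTF-8, the search is done with the byte-based string.find (so the pattern should be plain text or ASCII-only), and the names byte_to_char_index and find_chars are made up for illustration. It still scans the prefix once per offset.

```lua
-- Hypothetical helper: convert a 1-based byte offset into a 1-based
-- character index by counting every byte that is not a UTF-8
-- continuation byte (10xxxxxx, i.e. 0x80..0xBF). Assumes valid UTF-8.
function byte_to_char_index(str, byte_pos)
    local count = 0
    for i = 1, byte_pos do
        local byte = string.byte(str, i)
        if byte < 0x80 or byte >= 0xC0 then
            count = count + 1
        end
    end
    return count
end

-- Hypothetical wrapper: search with the byte-based string.find and
-- translate the resulting byte offsets into character indices
-- without building any substrings.
function find_chars(str, pat)
    local a, b = string.find(str, pat)
    if not a then return nil end
    return byte_to_char_index(str, a), byte_to_char_index(str, b)
end

-- "äöü x" written with decimal escapes; "x" is the 5th character
-- but the 8th byte.
print(find_chars("\195\164\195\182\195\188 x", "x"))  --> 5  5
```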

In my personal utf8_find, I think I'll use both Lua solutions and check for differences between the find and match approaches for the sake of robustness (until I regain confidence in unicode.utf8.find).
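Such a cross-check could look roughly like the sketch below. The name make_checked_find is hypothetical, and the two implementations are passed in as arguments, so the sketch itself does not depend on the unicode library.

```lua
-- Hypothetical helper: build a find function that runs two
-- implementations and raises an error if their results differ.
function make_checked_find(find_a, find_b)
    return function(str, pattern, init)
        local a1, a2 = find_a(str, pattern, init)
        local b1, b2 = find_b(str, pattern, init)
        if a1 ~= b1 or a2 ~= b2 then
            error(string.format(
                "find mismatch for pattern %q: (%s, %s) vs (%s, %s)",
                pattern, tostring(a1), tostring(a2),
                tostring(b1), tostring(b2)))
        end
        return a1, a2
    end
end

-- Stand-in usage with the byte-based string.find on both sides.
local checked = make_checked_find(string.find, string.find)
print(checked("hello", "ll"))  --> 3  4
```

In LuaTeX one would pass the two wrappers from this thread instead of string.find. Note that find_utf8_chars as quoted above takes no init argument, so the comparison would only be meaningful for searches starting at the first character.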

Best regards,
Stephan Hennig

[1] <URL:http://permalink.gmane.org/gmane.comp.tex.luatex.user/1182>
