On Wed, Mar 3, 2010 at 10:19 AM, Manuel Pégourié-Gonnard <[email protected]> wrote:
> luigi scarso wrote:
>>> this discussion is IMO whether unicode.* libraries are a replacement for
>>> string or not.
>> Hm.
>> A difficult question.
>>
> IMO not. The comments state that unicode.ascii and unicode.latin1 are
> locale-independent replacements for string, but don't say anything about
> unicode.utf8, and that's probably for a reason. But as Taco said, this would
> be best discussed with the selene developers.

My point is not about this implementation but about keeping the semantics of
string.* and unicode.* separate. In Lua, the string module covers 0x00 to
0xff: it is octet-oriented, and it is in the Lua core. The name "string" will
always be in "conflict" with any unicode.* implementation; there is currently
no unicode module in the Lua core because the core is plain ANSI C.
The Selene implementation resolves this "conflict" in a precise manner: its C
code is not too long to check and understand. Some agree and some do not, but
it is not a bug. We do not have a buggy luatex, and this is important.

>> Can we implement an acceptable wrapper ?
>>
> Here it is again, now in the form of a function:
>
> function find_utf8_chars(str, pat)
>   local a, b = unicode.utf8.find(str, pat)
>   a = unicode.utf8.len(string.sub(str, 1, a))
>   b = unicode.utf8.len(string.sub(str, 1, b))
>   return a, b
> end

For example, here I disagree because you mix string and unicode.utf8. But this
is only my first look; I should check. Maybe it is the only way to resolve
the problem.
Anyway, I don't consider this a waste of time.

-- 
luigi
