On 12/12/2013 1:30 PM, Arthur Reutenauer wrote:
(I didn't try it, I promise.)
  if match("à","%s") then
    print("space")
  else
    print("not a space")
  end
à is U+00E0, which is 0xc3 0xa0 in UTF-8, and U+00A0 is the
unbreakable space, so I'm guessing we get "space".
  if match("á","%s") then
    print("space")
  else
    print("not a space")
  end
The two UTF-8 bytes are 0xc3 0xa1 here (á is U+00E1), and I can't
remember offhand what U+00A1 is, but I'm guessing "not a space".
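
The byte values above can be checked with a few lines of plain Lua. This is
only a sketch: hexbytes is a hypothetical helper, not a library function, and
the strings are written with decimal escapes so the result does not depend on
how the file itself is encoded.

```lua
-- hexbytes is a hypothetical helper: it lists the bytes of a string in hex.
local function hexbytes(s)
  local out = {}
  for i = 1, #s do
    out[#out + 1] = string.format("0x%02x", s:byte(i))
  end
  return table.concat(out, " ")
end

print(hexbytes("\195\160"))  -- à (U+00E0): 0xc3 0xa0
print(hexbytes("\195\161"))  -- á (U+00E1): 0xc3 0xa1
```

For comparison, 0xc3 0xa2 is â (U+00E2). Only à has 0xa0 as its second byte,
the value that, read on its own, is the code point of the no-break space.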
Luigi and I are looking into it, and it looks like some 'wrong increment
when no match' kind of bug, so we end up in the middle of a multibyte
UTF-8 sequence
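
A minimal sketch of how that kind of bug could produce the "space" result.
This is not the LuaTeX source: every name below is hypothetical, is_space is a
deliberately tiny stand-in for the real whitespace class, and the input is
assumed to be well-formed UTF-8.

```lua
-- Illustrative only; all names are hypothetical.

-- Byte length of the UTF-8 sequence introduced by lead byte b.
local function seqlen(b)
  if b < 0x80 then return 1
  elseif b >= 0xF0 then return 4
  elseif b >= 0xE0 then return 3
  elseif b >= 0xC0 then return 2
  else return 1 end  -- continuation byte met on its own
end

-- Number of payload values in the lead byte, indexed by sequence length.
local lead_mod = { [2] = 32, [3] = 16, [4] = 8 }

-- Decode the code point starting at byte position i.
-- A lone continuation byte is returned as-is, which is how a misaligned
-- scan ends up reading 0xA0 as U+00A0.
local function decode(s, i)
  local b = s:byte(i)
  local n = seqlen(b)
  if n == 1 then return b end
  local cp = b % lead_mod[n]
  for k = 1, n - 1 do
    cp = cp * 64 + (s:byte(i + k) % 64)
  end
  return cp
end

-- Assumption: a very small whitespace class, enough for this example.
local function is_space(cp)
  return cp == 0x20 or cp == 0xA0 or (cp >= 0x09 and cp <= 0x0D)
end

-- step(s, i) decides how far to advance after a failed match at i.
local function find_space(s, step)
  local i = 1
  while i <= #s do
    if is_space(decode(s, i)) then return i end
    i = i + step(s, i)
  end
  return nil
end

local by_char = function(s, i) return seqlen(s:byte(i)) end  -- whole character
local by_byte = function() return 1 end                      -- suspected bug

print(find_space("\195\160", by_char))  -- à, correct step: nil
print(find_space("\195\160", by_byte))  -- à, wrong step: 2
print(find_space("\195\161", by_byte))  -- á, wrong step: nil
```

With the one-byte step the scan restarts on the second byte of à, sees 0xa0,
and reports a space. The same mistake stays invisible for á, because 0xa1 is
not a whitespace code point.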
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------