On 12/12/2013 1:30 PM, Arthur Reutenauer wrote:

   (I didn't try it, I promise.)

if match("à","%s") then
     print("space")
else
     print("not a space")
end

   à is U+00E0, which is 0xc3 0xa0 in UTF-8, and U+00A0 is the
unbreakable space, so I'm guessing we get "space".

if match("á","%s") then
     print("space")
else
     print("not a space")
end

   The two UTF-8 bytes are 0xc3 0xa1 here (á is U+00E1), and I can't
remember offhand what 0xa1 is, but I'm guessing "not a space".
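
For anyone who would rather check the byte values than guess, here is a
minimal sketch in plain Lua. The helper name hexbytes is made up for
this example, and the strings are written with decimal escapes so the
result does not depend on how the file is saved.

```lua
-- Hypothetical helper: return the bytes of a string as hex, space separated.
local function hexbytes(s)
    local t = {}
    for i = 1, #s do
        t[#t + 1] = string.format("0x%02x", string.byte(s, i))
    end
    return table.concat(t, " ")
end

local a_grave = "\195\160"  -- U+00E0 (à) encoded as UTF-8
local a_acute = "\195\161"  -- U+00E1 (á) encoded as UTF-8

print(hexbytes(a_grave))  -- 0xc3 0xa0
print(hexbytes(a_acute))  -- 0xc3 0xa1
```

This only shows the raw bytes; what match does with them depends on
which match function is in scope, which the snippets above leave open.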

Luigi and I are looking into it. It looks like some 'wrong increment when no match' kind of bug, so we end up in the middle of a multibyte UTF-8 sequence.
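
To make that class of bug concrete, here is a small sketch in plain Lua.
It is NOT the actual library code: the names seqlen, decode, is_space
and find_space are invented for this illustration, the decoder is
deliberately naive, and treating U+00A0 as a space is an assumption
taken from the guess above.

```lua
-- Hypothetical illustration only; not the real matcher.

-- Byte length of the UTF-8 sequence that starts with lead byte b.
local function seqlen(b)
    if b < 0x80 then return 1
    elseif b >= 0xF0 then return 4
    elseif b >= 0xE0 then return 3
    elseif b >= 0xC0 then return 2
    else return 1 end  -- stray continuation byte (0x80..0xBF)
end

-- Divisor that strips the length marker bits from a lead byte.
local lead_mod = { [2] = 32, [3] = 16, [4] = 8 }

-- Naive decoder: a stray continuation byte is returned as its own value,
-- which is exactly what makes the bug below visible.
local function decode(s, i)
    local b = string.byte(s, i)
    local n = seqlen(b)
    if n == 1 then return b end
    local cp = b % lead_mod[n]
    for k = 1, n - 1 do
        local c = string.byte(s, i + k) or 0x80  -- tolerate truncation
        cp = cp * 64 + (c % 64)
    end
    return cp
end

-- Simplified stand-in for %s (assumption: U+00A0 counts as a space).
local function is_space(cp)
    return cp == 0x20 or cp == 0xA0
end

-- Return the byte index of the first "space", or nil.
-- step_by_char = false mimics the suspected bug: after a failed match
-- the position advances by one byte instead of one whole character.
local function find_space(s, step_by_char)
    local i = 1
    while i <= #s do
        if is_space(decode(s, i)) then return i end
        if step_by_char then
            i = i + seqlen(string.byte(s, i))
        else
            i = i + 1
        end
    end
    return nil
end

local a_grave = "\195\160"  -- U+00E0 (à) as UTF-8
print(find_space(a_grave, false))  -- 2: lands on 0xa0 inside the sequence
print(find_space(a_grave, true))   -- nil: whole character skipped
```

With the one-byte step, the second attempt starts on the continuation
byte 0xa0 and reads it as U+00A0, so à appears to contain a space.
Stepping by the full sequence length avoids that.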

Hans


-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
    tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------
