On 02.03.2010 14:41, luigi scarso wrote:
> On Tue, Mar 2, 2010 at 2:01 PM, Stephan Hennig <[email protected]> wrote:
>> The output of
>>
>>   str = "abcde"
>>   print(unicode.utf8.match(str, "()e"))
>>   str = "Äabcde"
>>   print(unicode.utf8.match(str, "()e"))
>>
>> is 5 and 7.  The second one is obviously wrong.
> I believe 7 is ok, because in UTF-8 Äabcde is 7 octets long
> and unittest.c says
>   NOTE: find positions are in bytes for all ctypes!

Logicians might be satisfied with broken behaviour as long as it's documented. But I'm not a logician, so I cannot agree. :)
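
For what it's worth, a character position can be recovered from the reported byte position in plain Lua. This is only a minimal sketch, assuming valid UTF-8 input and a byte position within the string; byte_to_char_pos is a hypothetical helper of mine, not part of the unicode library:

```lua
-- Hypothetical helper (not part of the unicode library): convert a
-- 1-based byte position in a UTF-8 string to a 1-based character position.
-- Assumes s is valid UTF-8 and 1 <= byte_pos <= #s.
local function byte_to_char_pos(s, byte_pos)
  local chars = 0
  for i = 1, byte_pos do
    local b = s:byte(i)
    -- Continuation bytes look like 10xxxxxx (0x80..0xBF); every other
    -- byte starts a new character, so only those are counted.
    if b < 0x80 or b >= 0xC0 then
      chars = chars + 1
    end
  end
  return chars
end

local s = "\195\132abcde"                      -- "Äabcde" written as UTF-8 bytes
local byte_pos = string.find(s, "e", 1, true)  -- plain byte-wise search
print(byte_pos, byte_to_char_pos(s, byte_pos))
```

With the string from above this prints 7 and 6: the byte position that is reported today, and the character position I would have expected.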

Best regards,
Stephan Hennig
