On 04.03.2010 21:24, Khaled Hosny wrote:
On Thu, Mar 04, 2010 at 08:08:56PM +0000, Jonathan Fine wrote:
Yes, Stephan, they do start at zero.  So for Unicode in Lua, Python
would be a good example to study and perhaps follow.

Actually, Python < 3.0 is a horrible mess Unicode-wise and I'd never
ever try to follow it. Python is my favorite programming language, but
I'd never call its Unicode support a "good example".

But I think Jonathan's Python example can act as an indicator that slnunicode does things wrong[1], or at least that it doesn't comply with conventions, even if its behaviour is documented.

In Python, len and index always stay consistent about what the length of the string is, whether they treat it as a byte sequence or as a Unicode string. Each line below shows the string literal, its printed form, its len, and the index of 'b':

'\xc3\xa4b' äb 3 2
'\xc3\xb6\xc3\xa4b' öäb 5 4
u'\xe4b' äb 2 1
u'\xf6\xe4b' öäb 3 2
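
For readers on Python 3, where byte strings and Unicode strings are the separate types bytes and str, the same check can be sketched roughly as follows. This is only an illustration under that assumption; length_and_index is a hypothetical helper name, not a library function.

```python
# Minimal sketch (Python 3). length_and_index is a made-up helper name.
def length_and_index(s, needle):
    """Return (len(s), position of needle in s), both counted in the same unit."""
    return len(s), s.index(needle)

for text in ("äb", "öäb"):
    raw = text.encode("utf-8")
    # bytes: length and index are both counted in bytes
    print(repr(raw), *length_and_index(raw, b"b"))
    # str: length and index are both counted in code points
    print(repr(text), *length_and_index(text, "b"))
```

In both views the index of the last character is len - 1, which is the consistency the table above shows.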

Best regards,
Stephan Hennig

[1] I know that a sample of one language might be a rather weak argument. But I have yet to see a language supporting slnunicode's behaviour, or a use case that doesn't qualify as a programming error.
