Re: [BUG] 'non-empty string' >? '' returns false on amd64 arch

Tony Mechelynck Tue, 24 May 2011 18:49:21 -0700

On 25/05/11 02:56, Ivan Krasilnikov wrote:

Also mb_strnicmp() assumes that lowercase and uppercase characters
have the same length in UTF-8 representation. This isn't the case.
Here are a few counterexamples:


$ python -c 'print " ".join(["0x%.2X" % n for n in range(65536) if
len(unichr(n).encode("utf8")) !=
len(unichr(n).lower().encode("utf8"))])'

0x130 0x23A 0x23E 0x1E9E 0x2126 0x212A 0x212B 0x2C62 0x2C64 0x2C6D 0x2C6E 0x2C6F

So I think the UTF-8 part of mb_strnicmp() needs to be completely rewritten.
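
To illustrate the point, here is a minimal Python 3 sketch of a case-insensitive comparison that does not assume a character and its case counterpart occupy the same number of bytes. It is not Vim's code (Vim is written in C). The name utf8_icmp() is hypothetical, and str.casefold() applies Unicode full case folding, which is only a stand-in for whatever folding Vim would actually use. The idea is to decode to code points first and fold afterwards, rather than stepping through both byte strings in lockstep.

```python
def utf8_icmp(a: bytes, b: bytes) -> int:
    """Compare two UTF-8 byte strings ignoring case.

    Returns -1, 0 or 1, like a strcmp()-style function.
    Hypothetical helper for illustration only.
    """
    # Decode first, so the comparison works on characters, not bytes.
    fa = a.decode("utf-8").casefold()
    fb = b.decode("utf-8").casefold()
    # Folded strings may differ in length from the originals; that is
    # fine because nothing here depends on byte offsets lining up.
    return (fa > fb) - (fa < fb)


# U+212A KELVIN SIGN is 3 bytes in UTF-8; its lowercase "k" is 1 byte.
kelvin = "\u212a".encode("utf-8")
print(len(kelvin), len(b"k"), utf8_icmp(kelvin, b"k"))
```

A byte-limited variant (the "n" in strnicmp) would additionally need to track how many bytes of each input have been consumed separately, since the two counts can diverge.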

Yes, and in Turkish (i.e. with ":lang ctype tr" and 'casemap' empty), I and i (1 byte each) have as their respective case counterparts ı and İ (2 bytes each).
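
The byte counts above can be checked with a short Python 3 sketch. The Turkish pairs are hard-coded here as an assumption of the example, because Python's str.lower() and str.upper() are not locale-aware and will not produce the Turkish mappings by themselves. The helper name utf8_len() is hypothetical.

```python
# Turkish case pairs, written out by hand (Python's str.lower()/upper()
# do not apply locale-specific rules).
TURKISH_PAIRS = [
    ("I", "\u0131"),  # capital I      <-> small dotless i (U+0131)
    ("i", "\u0130"),  # small i        <-> capital I with dot above (U+0130)
]


def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


for ascii_ch, turkish_ch in TURKISH_PAIRS:
    print("%s: %d byte(s)   U+%04X: %d byte(s)" % (
        ascii_ch, utf8_len(ascii_ch),
        ord(turkish_ch), utf8_len(turkish_ch)))
```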



Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
94. Now admit it... How many of you have made "modem noises" into
    the phone just to see if it was possible? :-)

--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
