On 19/12/2025 02:51, Thomas Passin wrote:
On 12/18/2025 1:34 PM, Gregg Drennan wrote:
No chatbot involved. I was typing this on my phone in bed last night
and didn't have a Python interpreter handy that I could verify this
with. I will certainly check this.
I apologise. I thought your post sounded chatbotty, especially at the
end. I have to admit I only tested with punctuation, not a wide range
of Unicode characters, but heck, that's what your post mentioned.
Apparently, if you want to use casefold() to compare characters, you
should call it on both sides of the comparison, and not assume it will
just give you lowercase. There are only a few characters where
casefold() and lower() give different results. Rare, but the German ß
is one of them: "ß".lower() leaves it as "ß", while "ß".casefold()
gives "ss". If you program using a German keyboard you probably know
this (I don't and didn't).
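
To make the casefold() point concrete, here is a small sketch. The
helper name same_ignoring_case is made up for illustration; only
str.lower() and str.casefold() are real.

```python
# Show where lower() and casefold() disagree, using the German sharp s.
for s in ["Straße", "STRASSE", "Hello"]:
    print(repr(s), "lower:", repr(s.lower()), "casefold:", repr(s.casefold()))


def same_ignoring_case(a: str, b: str) -> bool:
    """Hypothetical helper: fold BOTH sides before comparing."""
    return a.casefold() == b.casefold()


# lower() alone does not make these two spellings match...
print("Straße".lower() == "STRASSE".lower())     # False
# ...but folding both sides does.
print(same_ignoring_case("Straße", "STRASSE"))   # True
```

Folding only one side and comparing it against something you merely
lowercased is the trap: "ß" survives lower() unchanged but becomes
"ss" under casefold().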
Strictly speaking, you might also need to normalise the strings
(`unicodedata.normalize`) because, for example, "é" can be 1 codepoint
("LATIN SMALL LETTER E WITH ACUTE") or 2 codepoints ("LATIN SMALL LETTER
E" followed by "COMBINING ACUTE ACCENT").
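
A rough sketch of the "é" case, in case it helps. The function name
caseless_equal is mine, and the normalise, fold, normalise again
pattern follows the one shown in the Python Unicode HOWTO; treat it as
one reasonable approach rather than the only correct one.

```python
import unicodedata

composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE (1 codepoint)
decomposed = "e\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

# They render the same but are different strings.
print(composed == decomposed, len(composed), len(decomposed))  # False 1 2


def caseless_equal(a: str, b: str) -> bool:
    """Hypothetical helper: normalise, casefold, then normalise again."""
    def key(s: str) -> str:
        nfd = unicodedata.normalize("NFD", s)
        # Casefolding can produce text that is no longer normalised,
        # hence the second normalize() call.
        return unicodedata.normalize("NFD", nfd.casefold())
    return key(a) == key(b)


print(caseless_equal(composed, decomposed))   # True
print(caseless_equal("\u00c9", decomposed))   # True: "É" vs decomposed "é"
```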
[snip]
--
https://mail.python.org/mailman3//lists/python-list.python.org