On 24Dec2022 09:11, Chris Angelico <ros...@gmail.com> wrote:
On Sat, 24 Dec 2022 at 09:07, Cameron Simpson <c...@cskk.id.au> wrote:
On 23Dec2022 22:27, Chris Angelico <ros...@gmail.com> wrote:
>I think this would be a useful feature to have, although it'll
>probably end up needing a LOT of information (you can't just say "give
>me a locale-correct uppercasing of this string" without further
>context). So IMO it should be third-party.
It would probably be good to have a caveat mentioning these context
difficulties in the docs of the unicodedata and str/string case fiddling
methods. Not a complete exposition, but making it clear that for some
languages the rules require context, maybe with a
hard-to-implement-correctly example of naive/incorrect use.
Do people actually read those warnings?
I have read them, I think, though not for a while.
Hang on, lemme pop into the time machine and add one to the docstring
and docs for str.upper(). Okay, I'm back. Tell me, have you read the
docstring?
Python 3.9.13 (main, Aug 11 2022, 14:01:42)
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> help(str.upper)
Help on method_descriptor:
upper(self, /)
Return a copy of the string converted to uppercase.
Hmm. Did you commit the change? Is the key to the time machine back on
its hook?
Docs:
str.upper()
Return a copy of the string with all the cased characters [4]
converted to uppercase. Note that s.upper().isupper() might be
False if s contains uncased characters or if the Unicode
category of the resulting character(s) is not “Lu” (Letter,
uppercase), but e.g. “Lt” (Letter, titlecase).
The uppercasing algorithm used is described in section 3.13 of
the Unicode Standard.
and [4] here:
Cased characters are those with general category property being one
of “Lu” (Letter, uppercase), “Ll” (Letter, lowercase), or “Lt”
(Letter, titlecase).
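A minimal sketch of the isupper() caveat in that docs excerpt, assuming CPython 3 and only the standard library. The variable names are illustrative and not taken from the docs:

```python
import unicodedata

# Uncased characters: upper() leaves them unchanged, and isupper()
# needs at least one cased character before it can return True.
digits = "1234"
digits_upper = digits.upper()
digits_is_upper = digits_upper.isupper()

# A titlecase letter is "cased" per footnote [4], but it is not "Lu".
dz_title = "\u01c5"  # LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
dz_category = unicodedata.category(dz_title)
dz_is_upper = dz_title.isupper()

print(repr(digits_upper), digits_is_upper)  # '1234' False
print(dz_category, dz_is_upper)             # Lt False
```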
wording that clarifies whether x.upper() uppercases the string
in-place?
Well, it says "a copy", so I'd say it's clear.
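A quick check of the "a copy" wording, as a sketch assuming any Python 3 interpreter: str objects are immutable, so upper() can only hand back a new value and the original name still refers to the unchanged string.

```python
original = "hello"
shouted = original.upper()

# The original string is untouched; only the returned value is uppercased.
print(original)  # hello
print(shouted)   # HELLO
```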
I've only got version 5.0 of Unicode here. [steps into the other
room...] Thank you, I see you used the time machine to buy me version
9.0 too :-)
Ah, 3.13 is 7 pages of compact text here.
I was thinking of something a bit more general, like "case changing is a
complex language- and context-dependent process, and str.upper (etc.)
therefore performs a simplistic operation".
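For instance, a naive-use example along these lines might make the point. It is only a sketch, assuming CPython 3 with its default (locale-independent) Unicode case mappings; the sample words and variable names are mine, not proposed doc text:

```python
# Length can change: German sharp s has no single-character uppercase
# in the default mapping, and the round trip does not restore it.
sharp_s = "straße"
upper_sharp_s = sharp_s.upper()          # "STRASSE"
round_trip = upper_sharp_s.lower()       # "strasse", not "straße"

# Locale is ignored: Turkish expects "i" -> "İ" (U+0130), but str.upper()
# applies the default mapping and yields a plain "I".
turkish_word = "istanbul"
default_upper = turkish_word.upper()     # "ISTANBUL"
turkish_upper = "\u0130STANBUL"          # what a Turkish reader would expect

# Lowercasing U+0130 yields two code points: "i" + U+0307 (combining dot).
dotted_capital_lower = "\u0130".lower()

print(upper_sharp_s, round_trip)
print(default_upper == turkish_upper)    # False
print(len(dotted_capital_lower))         # 2
```

None of these results is a bug in str.upper() or str.lower(); they are the default algorithm doing what it is specified to do without knowing the language.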
Cheers,
Cameron Simpson <c...@cskk.id.au>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/D47A4NQKHP4LBWM4B6J3XELBFVKN5DX6/