On Sat, Apr 10, 2021 at 12:15 AM Paul Bryan <[email protected]> wrote:
>
> This sounds more like a Unicode thing than a generic string thing. And, in 
> Unicode, Greek characters are included in multiple groupings. Searching for
> "Theta" to see what we get:
>
> Greek and Coptic:
> U+0398 GREEK CAPITAL LETTER THETA
> U+03B8 GREEK SMALL LETTER THETA
> U+03D1 GREEK THETA SYMBOL
> U+03F4 GREEK CAPITAL THETA SYMBOL
>
> Phonetic Extensions Supplement:
> U+1DBF MODIFIER LETTER SMALL THETA
>
> Mathematical Alphanumeric Symbols:
> U+1D6AF MATHEMATICAL BOLD CAPITAL THETA
> U+1D6B9 MATHEMATICAL BOLD CAPITAL THETA SYMBOL
> U+1D6C9 MATHEMATICAL BOLD SMALL THETA
> (... 17 more Thetas in this group! ...)
>
> If you were to pick a definitive set of Greek characters for your use case, 
> would it be in the Mathematical Alphanumeric Symbols category? Would others' 
> expected use of Greek characters match yours, or would it need to be 
> inclusive of all Greek characters across groupings?
>
> I'm beginning to sense a metal container containing wriggly things...
>

But I think you've also nailed the correct solution. Python comes with
[1] a unicodedata module, which would be the best way to define these
sorts of sets. It's a tad messy to try to gather the correct elements
though, so maybe the best way to do this would be a
unicodedata.search() function that returns a string of all characters
with a particular string in their names, or something like that.
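
A minimal sketch of what such a search could look like, built only on the
existing unicodedata.name() call. The function name search_names is
hypothetical; nothing like it is part of the unicodedata module today, and
the "substring of the character name" matching rule is just one assumption
about how it might behave.

```python
import sys
import unicodedata


def search_names(fragment):
    """Return a string of all characters whose Unicode name contains fragment.

    Hypothetical helper, not part of the standard library.
    """
    fragment = fragment.upper()  # Unicode character names are uppercase
    found = []
    for codepoint in range(sys.maxunicode + 1):
        char = chr(codepoint)
        # unicodedata.name() raises ValueError for unnamed code points
        # unless a default is given, so fall back to an empty string.
        if fragment in unicodedata.name(char, ""):
            found.append(char)
    return "".join(found)


thetas = search_names("theta")
for char in thetas:
    print(f"U+{ord(char):04X} {unicodedata.name(char)}")
```

This walks every code point on each call, so it is slow, and the result
depends on the Unicode version the interpreter was built against. It also
returns characters from every block, so deciding which of them count as
"Greek" is still left to the caller.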

ChrisA

[1] technically, CPython and many other implementations come with, but
there are some (eg uPy) that don't
_______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/5MRAFMNZQ27DDAA7ZRD2E55OAFKWD734/
Code of Conduct: http://python.org/psf/codeofconduct/
