On 03.11.21 14:31, Petr Viktorin wrote:
> For example: should the parser emit a lightweight audit event if it
> finds a non-ASCII identifier? (See below for why ASCII is special.)
> Or for encoding declarations?

There are audit events for import and compile. You can also register import hooks if you want fancier preprocessing than just a Unicode encoding. I do not think we need to add more specific audit events; they were not designed for this. And I think it is too late to detect suspicious code at the time of its execution. It should be detected before the code is added to the code base (review tools, pre-commit hooks).

> I don't think this would actually ban Cyrillic/Greek.
> (My suggestion is not vanilla confusables detection; it might require
> careful reading: "should there be a [linter] warning when an identifier
> looks like ASCII but isn't?")

Yes, but it should be optional and configurable, and not part of the Python compiler. This is not our business as Python core developers.

> I am not a native speaker, but I did try a bit to find an actual
> ASCII-like word in a language that uses Cyrillic. I didn't succeed; I
> think they might be very rare.

With a simple script I have found 62 words common to English and Ukrainian: гасу/racy, горе/rope, рима/puma, міх/mix, etc. But there are many more English and Ukrainian words that contain only letters which can be confused with letters from the other script. And identifiers can contain abbreviations and shortenings; not all of them can be found in dictionaries.

> Even if there was such a word -- or a one-letter abbreviation used as a
> variable name -- it would be confusing to use. Removing the possibility
> of confusion could *help* Cyrillic users. (I can't speak for them; this
> is just a brainstorming idea.)

I have never used non-Latin identifiers in Python, but I guess that where they are used (in schools?) there is a mix of English and non-English identifiers, and identifiers consisting of parts of English and non-English words without even an underscore between them. I know because in other languages they just use inconsistent transliteration.

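To illustrate what an optional check living outside the compiler (in a review tool or pre-commit hook) might look like, here is a minimal sketch. It is not the script mentioned above. The lookalike table is an assumption derived only from the four word pairs quoted in this message, and the names `PAIRS`, `LOOKALIKES`, `ascii_lookalike` and `suspicious_identifiers` are hypothetical. A real tool would need a much fuller table, for example one built from the Unicode confusables data.

```python
import io
import tokenize

# Assumption: the table is inferred from the four example pairs only.
PAIRS = [("гасу", "racy"), ("горе", "rope"), ("рима", "puma"), ("міх", "mix")]
LOOKALIKES = {ord(c): l for cyr, lat in PAIRS for c, l in zip(cyr, lat)}


def ascii_lookalike(word):
    """Return the ASCII string that a non-ASCII `word` resembles, or None."""
    if word.isascii():
        return None  # genuinely ASCII, nothing to report
    candidate = word.translate(LOOKALIKES)
    # Only a match if every non-ASCII letter had an ASCII twin in the table.
    return candidate if candidate.isascii() else None


def suspicious_identifiers(source):
    """Yield (line, name, lookalike) for names that look like ASCII but are not."""
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            twin = ascii_lookalike(tok.string)
            if twin is not None:
                yield tok.start[0], tok.string, twin


if __name__ == "__main__":
    sample = "міх = 1\nmix = 2\nслово = 3\n"
    for line, name, twin in suspicious_identifiers(sample):
        print(f"line {line}: {name!r} could be mistaken for {twin!r}")
```

Because it reports only names made up entirely of lookalike letters, ordinary Cyrillic identifiers such as слово pass silently. Whether to run such a check at all remains a per-project decision.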
Emitting any warning by default would discriminate against non-English users. It would have been better not to add support for non-ASCII identifiers in the first place.