Out of all the approximately thousand bazillion ways to write obfuscated 
Python code, which may or may not be malicious, why are Unicode 
confusables worth this level of angst and concern?

I looked up "Unicode homoglyph" on CVE, and found a grand total of seven 
hits:

https://www.cvedetails.com/google-search-results.php?q=unicode+homoglyph

all of which appear to be related to impersonation of account names. I 
daresay if I expanded my search terms, I would probably find some more, 
but it is clear that Unicode homoglyphs are not exactly a major threat.

In my opinion, the other Steve's (Stestagg) example of obfuscated code 
with homoglyphs for e (as well as a few similar cases, such as 
homoglyphs for A) mostly makes for an amusing curiosity, perhaps worth a 
plugin for Pylint and other static checkers, but not much more. I'm not 
entirely sure what Paul's more lurid examples are supposed to indicate. 
If your threat relies on a malicious coder smuggling in identifiers like 
"𝚑𝓮𝖑𝒍𝑜" or "ªº" and having the reader not notice, then I'm not going to 
lose much sleep over it.
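For what it's worth, the sort of static check I have in mind could be sketched along these lines. This is only an illustration built on the standard `tokenize` and `unicodedata` modules, not an actual Pylint plugin; the function name `non_ascii_identifiers` is made up for the example, and a real checker would presumably want an allow-list for projects that legitimately use non-ASCII names.

```python
import io
import tokenize
import unicodedata


def non_ascii_identifiers(source):
    """Yield (line, name, char_names) for each NAME token with non-ASCII characters.

    Hypothetical helper for illustration only; it flags every non-ASCII
    identifier and makes no attempt to decide which ones are "confusable".
    """
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type == tokenize.NAME and not tok.string.isascii():
            # Spell out the official Unicode name of each suspicious character
            # so a reviewer can see what is really in the identifier.
            char_names = [
                unicodedata.name(ch, "<unnamed>")
                for ch in tok.string
                if not ch.isascii()
            ]
            yield tok.start[0], tok.string, char_names


# The middle letter below is U+0435, written as an escape so it cannot be
# mistaken for a Latin "e" in this listing.
sample = "result = l\u0435n(sequence)\n"
for line, name, char_names in non_ascii_identifiers(sample):
    print(line, ascii(name), char_names)
```

Reporting the character names rather than just the identifier is deliberate: the whole problem is that the identifier itself looks innocent on screen.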

Confusable account names and URL spoofing are proven, genuine threats. 
Beyond that, IMO the actual threat window from confusables is pretty 
small. Yes, you can write obfuscated code, and smuggle in calls to 
unexpected functions:

    result = lะตn(sequence)  # Cyrillic letter small Ie

but you still have to smuggle in a function to make it work:

    def lะตn(obj):
        # something malicious

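And if you can get that definition past a reviewer, the Unicode letter 
is redundant: you could just as easily give the function an 
innocuous-looking ASCII name. I'm not sure why any attacker would bother.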
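To make that concrete, here is a small sketch showing that the lookalike spelling is simply a different name as far as Python is concerned. The Cyrillic letter is written as the escape `\u0435` so that it is visible in the listing, and the code is run through `exec` purely so the demonstration fits in one file; the replacement function body is an invented stand-in for "something malicious".

```python
import unicodedata

latin = "len"
lookalike = "l\u0435n"  # middle letter is U+0435 CYRILLIC SMALL LETTER IE

# The two spellings are different strings, and NFKC normalisation (which
# Python applies to identifiers) does not fold the Cyrillic letter to Latin.
print(latin == lookalike)
print(unicodedata.normalize("NFKC", lookalike) == latin)

namespace = {}
raised = False
try:
    # No function with the lookalike name exists yet, so this call fails.
    exec("result = l\u0435n([1, 2, 3])", namespace)
except NameError as exc:
    raised = True
    print("NameError:", exc)

# Only once the attacker has also smuggled in a definition does the call work.
exec(
    "def l\u0435n(obj):\n"
    "    return 'not the builtin'\n"
    "result = l\u0435n([1, 2, 3])\n",
    namespace,
)
print(namespace["result"])
```

The first call fails because nothing by that name has been defined; the lookalike only becomes useful after the second, much more visible, piece of code has also been accepted.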


-- 
Steve
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XNRW6JSFGO4DQOGVNY2FEZAUBN6P2HRR/