[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

Chris Angelico Mon, 15 Nov 2021 03:47:29 -0800

On Mon, Nov 15, 2021 at 10:22 PM Abdur-Rahmaan Janhangeer
<arj.pyt...@gmail.com> wrote:
>
> Greetings,
>
>
> > Now what happens? where do you go from there to a vunerability or
> backdoor? I think it might be a bit obvious that there is something
> funny going on if I see:
>
>     if (user.admin == "root" and check_password_securely()
>             or user.admin == "root"
>             # Second string has hidden characters, do not remove it.
>             ):
>         elevate_privileges()
>
>
> Well, it's not so obvious. From Ross Anderson and Nicholas Boucher
> src: https://trojansource.codes/trojan-source.pdf
>
> See appendix H. for Python.
>
> with implementations:
>
> https://github.com/nickboucher/trojan-source/tree/main/Python
>
> Rely precisely on bidirectional control chars and/or replacing look alikes


The point of those kinds of attacks is that syntax highlighters and
related code review tools would misinterpret them. So I pulled them
all up in both GitHub's view and the editor I personally use (SciTE,
albeit a fairly old version now). GitHub specifically flags it as a
possible exploit in a couple of cases, but also syntax highlights the
return keyword appropriately. SciTE doesn't give any sort of warnings,
but again, correctly highlights the code - early-return shows "return"
as a keyword, invisible-function shows the name "is_" as the function
name and the rest not, homoglyph-function shows a quite
distinct-looking letter that definitely isn't an H.

The problems here are not Python's, they are code reviewers', and that
means they're really attacks against the code review tools. It's no
different from using the variable m in one place and rn in another,
and hoping that code review uses a proportionally-spaced font that
makes those look similar. So to count as a viable attack, there needs
to be at least one tool that misparses these; so far, I haven't found
one, but if I do, wouldn't it be more appropriate to raise the bug
report against the tool?

> > There is no reason why linters and code checkers shouldn't check for
> invisible characters, Unicode confusables or mixed script identifiers
> and flag them. The interpreter shouldn't concern itself with such purely
> stylistic issues unless there is a concrete threat that can only be
> handled by the interpreter itself.
>
>
> I mean current linters. But it will be good to check those for sure.
> As a programmer, i don't want a language which bans unicode stuffs.
> If there's something that should be fixed, it's the unicode standard, maybe
> defining a sane mode where weird unicode stuffs are not allowed. Can also
> be from language side in the event where it's not being considered in the 
> standard
> itself.

Uhhm..... "weird unicode stuffs"? Please clarify.

> I don't see it as a language fault nor as a client fault as they are 
> considering
> the unicode docs but the response was mixed with some languages decided to 
> patch it
> from their side, some linters implementing detection for it as well as some 
> editors flagging
> it and rendering it as the exploit intended.

I see it as an editor issue (or code review tool, as the case may be).
You'd be hard-pressed to get something past code review if it looks to
everyone else like you slipped a "return" statement at the end of a
docstring.

So far, I've seen fewer problems from "weird unicode stuffs" than from
the quoted-printable encoding, and that's an attack that involves
nothing but ASCII text. It's also an attack that far more code review
tools seem to be vulnerable to.

ChrisA
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OUPC6LGFXIILBTNEC4FYTERBX7VKQHDX/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

Reply via email to