Re: [Python-Dev] Python-3.0, unicode, and os.environ

Glenn Linderman Sun, 07 Dec 2008 18:23:54 -0800

On approximately 12/7/2008 10:56 AM, came the following characters from the keyboard of Adam Olsen:

You might receive a UTF-8 encoded file name from a malicious user,
check if it contains something dangerous (like
"../../../../../etc/password"), then decode it.  If your decoder isn't
compliant (i.e. doesn't check for overly long sequences) then a
b'\xC0\xAF' gets translated into u'/', bypassing your previous check.



You might indeed.

But if you are interested in checking for security issues, shouldn't you _first_ decode into some canonical form, specifying what sorts of Unicode strictness (such as overlong sequences) to check for during the decode process, and once the string is in canonical form, _then_ do checks for various attacks, such as the ../ sequence you mention?

And with that order of operation, even if you don't reject overlong sequences, you have canonized them, and can recognize the resulting characters as good or bad.
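
To make the ordering concrete, here is a rough sketch in Python 3. Everything in it is an assumption for illustration: lenient_decode is a hypothetical stand-in for a non-compliant decoder (it only handles the overlong two-byte form of '/'), and the helper names are made up. Python 3's own UTF-8 decoder is strict and raises UnicodeDecodeError on b'\xC0\xAF', so the lenient behaviour has to be simulated.

```python
def lenient_decode(raw: bytes) -> str:
    # Hypothetical non-compliant decoder: accepts the overlong two-byte
    # form b'\xC0\xAF' and maps it to '/', then decodes the rest as UTF-8.
    return raw.replace(b"\xc0\xaf", b"/").decode("utf-8")


def has_traversal(name: str) -> bool:
    # True if any path component of the decoded name is "..".
    return ".." in name.split("/")


def check_then_decode(raw: bytes):
    # Risky order: inspect the raw bytes, then decode.
    if b".." in raw.split(b"/"):
        return None
    return lenient_decode(raw)


def decode_then_check(raw: bytes):
    # Order suggested above: decode to canonical form, then inspect.
    name = lenient_decode(raw)
    return None if has_traversal(name) else name


attack = b"..\xc0\xaf..\xc0\xafetc\xc0\xafpasswd"
print(check_then_decode(attack))   # the traversal slips past the byte-level check
print(decode_then_check(attack))   # the traversal is caught after decoding
```

With the check done after decoding, the overlong sequence has already become an ordinary '/', so the same test that catches "../" in well-formed input catches it here too.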



--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
_______________________________________________
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
