I had the chance to chat with Rasmus Lerdorf a few weeks ago, and one
of the topics that came up was input filtering for web application
security. Rasmus knows a heck of a lot about this stuff (don't let
PHP's past mistakes make you think otherwise) and described the
following scheme, which I think Django could make good use of.
Basically, instead of accessing data from GET and POST directly,
applications use utility functions that filter depending on what the
application is asking for. Say you want to get an integer that
someone has entered. In current Django, you might do this:
a = int(request.GET['a'])
With smart input filtering, you would do something like this instead:
a = request.GET.as_int('a')
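To make the idea concrete, here is a rough sketch of how such an accessor might behave. FilteredDict is a hypothetical stand-in for request.GET (nothing like it exists in Django today), and returning a default on bad input rather than raising is an assumption, not part of the scheme as described:

```python
class FilteredDict(dict):
    """Hypothetical stand-in for request.GET; not part of Django."""

    def as_int(self, key, default=None):
        # Return the value as an int, or `default` when the key is
        # missing or the value is not a valid integer.
        try:
            return int(self[key])
        except (KeyError, ValueError, TypeError):
            return default


# Simulated query string data for illustration.
GET = FilteredDict({"a": "42", "b": "not a number"})

print(GET.as_int("a"))                   # 42
print(GET.as_int("b"))                   # None
print(GET.as_int("missing", default=0))  # 0
```

The point of the design is that the caller never touches the raw string, so there is no unhandled ValueError waiting for the first person who types "abc" into the URL.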
Functions like this can be created for all kinds of data. Here are a
few examples off the top of my head:
a = request.GET.as_email('email')
a = request.GET.as_float('f')
a = request.GET.as_safe_html('body')
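A couple of these could be sketched along the following lines. Again FilteredDict is a hypothetical stand-in for request.GET, the email pattern is deliberately simplistic, and rejecting "nan" and "inf" is an assumption about what a sensible as_float would do. as_safe_html is left out because it would realistically need a proper HTML sanitizer rather than a few lines of code:

```python
import math
import re

# Deliberately simple pattern: one "@", no whitespace, a dot in the domain.
# This is an illustration, not a full RFC 5322 validator.
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


class FilteredDict(dict):
    """Hypothetical stand-in for request.GET; not part of Django."""

    def as_float(self, key, default=None):
        try:
            value = float(self[key])
        except (KeyError, ValueError, TypeError):
            return default
        # float() accepts "nan" and "inf"; treat those as invalid input here.
        return value if math.isfinite(value) else default

    def as_email(self, key, default=None):
        value = self.get(key)
        if isinstance(value, str) and _EMAIL_RE.match(value.strip()):
            return value.strip()
        return default


# Simulated query string data for illustration.
GET = FilteredDict({"email": "simon@example.com", "f": "3.5", "bad": "nan"})

print(GET.as_email("email"))  # simon@example.com
print(GET.as_float("f"))      # 3.5
print(GET.as_float("bad"))    # None
```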
This is great for people who use them, but what about developers who
lack the discipline to do so? The proposed solution is to strip
/anything/ that is potentially harmful from all input unless expressly
told otherwise. Consider the following input data:
This has <script>alert('scary')</script> looking code in it as well
as some \00 null bytes and other weird escape characters \\'''';
DELETE FROM pages;
Accessed through the regular method it would automatically have
potential nasties stripped:
>>> text = request.GET['body']
>>> text
This has alert(scary) looking code in it as well as some null bytes
and other weird escape characters DELETE FROM pages
(I don't know if stripping should be this aggressive; this is just an
example.)
If you want the raw data without stripping applied, you do this:
text = request.GET.as_raw('body')
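A minimal sketch of the strip-by-default behaviour with an as_raw escape hatch might look like this. FilteredDict and strip_nasties are hypothetical names, and the particular set of characters removed is an assumption chosen only for illustration. Stripping tags with a regular expression is not a robust defence on its own, and this sketch does not try to reproduce the sample output above exactly:

```python
import re

# Anything that looks like an HTML tag.
_TAG_RE = re.compile(r"<[^>]*>")

# Characters removed by default: backslash, quotes, semicolon, and
# control characters other than tab, newline and carriage return.
_UNSAFE_CHARS = "\\'\";" + "".join(
    chr(c) for c in range(32) if chr(c) not in "\t\n\r"
)
_STRIP_TABLE = str.maketrans("", "", _UNSAFE_CHARS)


def strip_nasties(value):
    """Remove tag-like markup and a fixed set of risky characters."""
    return _TAG_RE.sub("", value).translate(_STRIP_TABLE)


class FilteredDict(dict):
    """Hypothetical stand-in for request.GET; not part of Django."""

    def __getitem__(self, key):
        # Ordinary access is filtered by default.
        return strip_nasties(super().__getitem__(key))

    def get(self, key, default=None):
        # dict.get() bypasses __getitem__, so route it through the filter too.
        return self[key] if key in self else default

    def as_raw(self, key):
        # Explicit opt-out: return the value exactly as submitted.
        return super().__getitem__(key)


raw = "Hi <script>alert('x')</script>\x00 there; DROP"
GET = FilteredDict({"body": raw})

print(GET["body"])                # Hi alert(x) there DROP
print(GET.as_raw("body") == raw)  # True
```

Overriding get() as well matters here: otherwise the most common way of reading optional parameters would quietly skip the filter.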
The idea is to make developers have to go out of their way to avoid
input filtering. This is certainly something that would greatly
benefit the world of PHP. Developers who use Django may like to think
themselves above such mistakes, but mistakes are easy to make.
Functionality like this, even if limited to input filters such as
as_email rather than filtering everything by default, would make it a
lot harder to mistakenly create insecure apps.
Cheers,
Simon