Hi,

Last week I looked into ticket [#18194][0]:

 - Made a few trivial attempts at handling the issue.
 - Wrote a minor initial patch.
 - The test fails for the cache and cookie backends.

Also, I watched Paul's talks on advanced security topics at the
Py/Django cons. They made me realise why I should not attempt
anything related to encryption in my project.

Academic pressure is high at the moment, so I have not been able to
give these enough time. I expect the situation to improve from this
weekend onwards.

I'll try to work on:

 - Writing tests which emulate the problem in #18194 well, and then
   working on the final fix.
 - Starting to look into resources useful for my project, like [The
   Tangled Web][1].

Rohan Jain


[0]: https://code.djangoproject.com/ticket/18194
[1]: http://www.amazon.com/The-Tangled-Web-Securing-Applications/dp/1593273886


On Fri, Apr 27, 2012 at 6:54 PM, Rohan Jain <crod...@gmail.com> wrote:
> Hi,
>
> I am Rohan Jain, a student from Indian Institute of Technology,
> Kharagpur. I'll be doing a Google Summer of Code project with django
> this year under the title "Security Enhancements". As the title
> suggests, it is about security improvements, such as better CSRF
> protection and tokenization.
>
> I have made some small updates to the proposal with the feedback it
> got. It is under VC over here: http://gist.github.com/2203174
> There isn't a direct way to diff gists, so here are the changes I did
> if somebody has already read the proposal:
>
>  - The origin check will be an additional step to ensure a valid
>   request and not standalone. The conventional checks will still
>   exist.
>
>  - Add some issues Luke pointed out about signing and using sessions.
>
>  - Add info about my github fork and branches.
>
> What I will be doing the following week:
>
>  - I haven't made any major contribution to django yet apart from a
>   tiny ticket some time ago. So, I'll be working on a ticket for the
>   next few weeks. It is related to the filesystem backend of
>   contrib.sessions and was raised some time ago:
>   https://code.djangoproject.com/ticket/18194
>
>  - Cleanup and organize the proposal a bit more (Probably start
>   tracking it as the CSRF protection page -
>   https://code.djangoproject.com/wiki/CsrfProtection)
>
> (I have also appended the current proposal below in this post)
>
> --
> Rohan
>
>
> Proposal
> --------
>
> #Abstract
>
> Django is a reasonably secure framework. It provides an API and
> development patterns which transparently take care of the common web
> security issues. But there are still security features which need
> attention. I propose to work on improved CSRF checking without any
> compromises and on integrating the existing work on a centralized
> token system. If time permits, I will also attempt to integrate
> django-secure.
>
> #Description
> ##CSRF Improvements
>
> Cross-Origin Resource Sharing (CORS):
> W3C has a working draft regarding [CORS][w3c-cors-draft], which opens
> up the possibility of allowing client-side cross-origin requests.
> This immediately brings to mind the ability to develop APIs which
> can be exposed directly to the web browser. This would let us get
> rid of proxies and other hacks used to achieve this.
> Currently all the major browsers support this: Chrome (all versions),
> Firefox (> 3.0), IE (> 7.0), Safari (> 3.2), Opera (> 12.0). Firefox
> and Chrome send the origin header for both AJAX and standard form POST
> requests. I introduce it here because later parts of the post refer
> to it.
>
> ###Origin checking
>
> With CORS around, the need for a CSRF token can be dropped, at least
> in some browsers. [Ticket #16859][orig-check-ticket] is an attempt at
> that. But it was rejected because it neglected the case where
> `CSRF_COOKIE_DOMAIN` is set (refer to the closing comment on the
> ticket for details). So to handle this we need to simulate checking of
> the CSRF cookie domain as web browsers do it. Maybe:
>
> ```python
>
> request.META.get('HTTP_ORIGIN', '').endswith(settings.CSRF_COOKIE_DOMAIN)
>
> ```
>
> In case the server receives an origin header in the request, it will
> be used for an initial check and then all the conventional checks
> will be done. General security will automatically improve as newer
> browsers which support the Origin header gain market share.
>
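Inline note on the suffix check quoted above: a plain `endswith` on the raw header accepts `https://evilexample.com` when the setting is `example.com`. A minimal sketch of a stricter comparison follows, assuming modern Python 3 and a hypothetical helper name `origin_matches_cookie_domain` (not part of Django). It compares on the host only and on a label boundary, which is roughly how browsers match cookie domains.

```python
from urllib.parse import urlsplit


def origin_matches_cookie_domain(origin, cookie_domain):
    """Return True if the Origin header's host falls under cookie_domain.

    The host must equal the domain or be a subdomain of it, so the
    comparison happens on a dot boundary rather than on raw characters.
    """
    if not origin or not cookie_domain:
        return False
    # hostname is lowercased and stripped of scheme and port;
    # it is None for values such as "null" that carry no host.
    host = urlsplit(origin).hostname
    if host is None:
        return False
    domain = cookie_domain.lstrip('.').lower()
    return host == domain or host.endswith('.' + domain)
```

With this, `https://app.example.com` matches `.example.com`, while `https://evilexample.com` does not.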
> As the closing comment points out, we can't do this with secure
> requests. They essentially need to be checked against the referrer or
> origin, at least for now. We cannot be sure that some untrusted or
> insecure subdomain has not already set the cookie or cookie domain.
> To deal with this, we have to consider https separately, as is
> being done now. So it will be something like:
>
> ```python
> def process_view(self, request, callback, callback_args, callback_kwargs):
>
>    # Same initial setup
>
>    if request.method not in ('GET', 'HEAD', 'OPTIONS', 'TRACE'):
>
>        host = request.get_host()
>        origin = request.META.get('HTTP_ORIGIN', "")
>        cookie_domain = settings.CSRF_COOKIE_DOMAIN
>
>        if request.is_secure():
>            good_referer = 'https://%s/' % host
>            referer = origin or request.META.get('HTTP_REFERER')
>            # Do the same origin checks here
>
>        # We are insecure, so care less
>        # A better way for this check can be used if needed
>        elif origin.endswith(cookie_domain):
>            # Safe, continue conventional checking
>            pass
>
>        # Do the conventional checks here
> ```
>
> If the above were to be implemented, the setting `CSRF_COOKIE_DOMAIN`
> should be deprecated for something like `CSRF_ALLOWED_DOMAIN` which
> makes more sense.
>
> ###Multiple Allowed Domains (was Better CORS Support)
> Since we are already introducing Origin checking, we can go one step
> further and try to provide better support for CORS in browsers which
> support it. A tuple/list setting which specifies the allowed domains
> will be provided. Using this, the various access control response
> headers will be set when the request origin is among the allowed
> domains. For the CSRF check, just see if the http origin is an
> allowed domain.
>
> ```python
>
> def set_cors_headers(response, origin):
>    response['Access-Control-Allow-Origin'] = origin
>
> def process_response(self, request, response):
>
>    origin = request.META.get('HTTP_ORIGIN', "")
>
>    if origin in settings.CSRF_ALLOWED_DOMAINS:
>        set_cors_headers(response, origin)
>
>    # Middleware must hand the response back
>    return response
>
> def process_request(self, request):
>
>    # Use origin in settings.CSRF_ALLOWED_DOMAINS here instead of
>    # origin.endswith
>    pass
>
> ```
>
> Probably, something similar to the above will be needed to incorporate
> the CORS support.
>
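Inline note: the allow-list idea can be tried out without any middleware. This is only a sketch, assuming Python 3; `ALLOWED_ORIGINS` and `cors_headers_for` are hypothetical names standing in for the proposed setting and helper. It uses exact matches against full origins (scheme plus host) and adds `Vary: Origin`, since the response now depends on the request's origin.

```python
# Hypothetical stand-in for the proposed settings.CSRF_ALLOWED_DOMAINS.
ALLOWED_ORIGINS = frozenset([
    "https://www.example.com",
    "https://api.example.com",
])


def cors_headers_for(origin, allowed=ALLOWED_ORIGINS):
    """Return the CORS response headers to add for a request origin.

    An empty dict means the origin is not allowed and no header is set.
    """
    if origin not in allowed:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        # Caches must not reuse this response for a different origin.
        "Vary": "Origin",
    }
```

The same membership test (`origin in allowed`) is what the CSRF check would use in place of the suffix comparison.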
> ###Less restrictive secure requests
>
> The current CSRF system is pretty much secure as it is. But CSRF
> protection places too many restrictions on https. It says no to all
> the requests, without honouring any tokens. It kind of has to, thanks
> to the way browsers allow cookie access. A cookie accessible through
> subdomains means that any subdomain, secure or insecure, can set the
> CSRF token, which could be really serious for site security. To get
> around this, currently one has to completely exempt views from CSRF
> and may or may not handle CSRF attacks. This can be dangerous. Also,
> if a person has a set of sites which talk to each other through
> clients and decides to run them over https, it would need some
> modifications.
>
> Django should behave under https similarly as it does under http
> without compromising any security. So, we need to make sure that the
> CSRF token is always set by a trusted site. Signing the data with the
> same key, probably `settings.SECRET_KEY`, across the sites looks apt
> for this, using `django.core.signing`. We can have `get_token` and
> `set_token` methods which abstract the signing process.
> This can be done in two ways:
>
>  - Store CSRF data in the session data in case `contrib.sessions` is
>   installed. Then the data will automatically be signed with the
>   secret key or will not be stored in the client as cookies at all.
>
>  - In case of it being absent from installed apps, revert to custom
>   signing
>
>  - Encryption?
>
> ```python
> from django.conf import settings
> from django.core.signing import TimestampSigner
> from django.utils.crypto import constant_time_compare
>
> # "csrf-token" is the salt; the key defaults to settings.SECRET_KEY
> signer = TimestampSigner(salt="csrf-token")
> CSRF_COOKIE_MAX_AGE = 60 * 60 * 24 * 7 * 52
>
>
> def get_unsigned_token(request):
>    # BadSignature exception needs to be handled somewhere
>    return signer.unsign(request.META.get("CSRF_COOKIE", ""),
>                         max_age=CSRF_COOKIE_MAX_AGE)
>
> def set_signed_token(response, token):
>    response.set_cookie(settings.CSRF_COOKIE_NAME,
>                        signer.sign(token),
>                        max_age=CSRF_COOKIE_MAX_AGE,
>                        domain=settings.CSRF_COOKIE_DOMAIN,
>                        path=settings.CSRF_COOKIE_PATH,
>                        secure=settings.CSRF_COOKIE_SECURE
>                        )
>
>
> def get_token(request):
>    if 'django.contrib.sessions' in settings.INSTALLED_APPS:
>        # The session object is dict-like
>        return request.session['csrf_token']
>    else:
>        return get_unsigned_token(request)
>
> def set_token(request, response, token):
>    if 'django.contrib.sessions' in settings.INSTALLED_APPS:
>        request.session['csrf_token'] = token
>    else:
>        set_signed_token(response, token)
>
> # Comparing to the token in the request
> constant_time_compare(request_csrf_token, get_token(request))
>
> ```
>
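Inline note: to make the signing step concrete without needing a Django project, here is a rough standard-library sketch of timestamped signing, assuming Python 3. It is not how `django.core.signing` is implemented; the names `sign` and `unsign`, the `value:timestamp:signature` layout, and the placeholder key are all my own assumptions for illustration.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"not-a-real-secret"  # placeholder for settings.SECRET_KEY


def _mac(key, payload):
    """Hex HMAC-SHA256 of the payload under the given key."""
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(value, key=SECRET_KEY, now=None):
    """Return 'value:timestamp:signature'."""
    timestamp = str(int(time.time() if now is None else now))
    payload = "%s:%s" % (value, timestamp)
    return "%s:%s" % (payload, _mac(key, payload))


def unsign(signed, max_age, key=SECRET_KEY, now=None):
    """Return the original value, or raise ValueError if invalid or stale."""
    try:
        payload, mac = signed.rsplit(":", 1)
        value, timestamp = payload.rsplit(":", 1)
        issued_at = int(timestamp)
    except ValueError:
        raise ValueError("malformed token")
    # Constant-time comparison, on bytes so odd input cannot raise TypeError.
    expected = _mac(key, payload)
    if not hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        raise ValueError("bad signature")
    current = time.time() if now is None else now
    if current - issued_at > max_age:
        raise ValueError("token expired")
    return value
```

The `now` parameter exists only so the expiry path can be exercised deterministically. Note that a valid signature says nothing about who placed the cookie, only that the server once issued the value.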
> Now, doing this is not as simple as the above code block makes it
> look. There is a lot which can and probably will go wrong with this
> approach:
>
>  - Even when the token is signed, other domains can completely replace
>   the CSRF token cookie, it won't grant them access through CSRF
>   check though. Even with signing, they just need to replay an
>   existing good token/cookie pair, which they can get directly from
>   the server any time they want.
>
>  - This sort of couples CSRF with sessions, a contrib app. Currently
>   nothing except some of the other contrib apps is tied to sessions.
>   It will break if sessions were to be removed in future or the API
>   changed. Also, this means that if one website is using session
>   CSRF, all of the others must be too. It would actually kind of be
>   a step backwards because of the coupling.
>
>  - If this were successfully implemented, is this exposing any
>   critical security flaws otherwise? Will it cause compatibility
>   issues?
>
>  - Encryption itself comes with its own issues. It will need careful
>   consideration.
>
> As Paul McMillan said, "This is a hard problem". I'll delegate
> figuring this out to future me. I will look into [The Tangled
> Web][tangled-web] and [Google's Browser Security
> Handbook][gobrowsersec] for ideas, again suggested by Paul on IRC.
>
> ##Centralized tokenization
> There are multiple places in django which use some or other kinds of
> tokens:
>
>  - contrib.auth (random password, password reset)
>  - formtools
>  - session (backends)
>  - cache
>  - csrf
>  - etags
>
> Token generation is pretty common around the framework. So, instead
> of each application having its own token system which needs to be
> maintained separately, there should be a centralized token system
> which provides an abstract API for everyone to use. In fact, I have
> seen that some apps use `User.objects.make_random_password` from
> contrib.auth for random generation, since they can be sure of it
> being maintained in the future. To me this looks kind of weird.
> At the last djangocon, a lot of work regarding this was done over at
> [Yarko's Fork][yarko-fork].
>
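Inline note: the kind of single entry point I have in mind could look roughly like the sketch below, assuming Python 3 and its `secrets` module. The name `make_token` and its defaults are hypothetical and are not the API in Yarko's fork; Django's own `django.utils.crypto.get_random_string` plays a similar role.

```python
import secrets
import string

DEFAULT_ALPHABET = string.ascii_letters + string.digits


def make_token(length=32, alphabet=DEFAULT_ALPHABET):
    """Return a random token drawn from a cryptographically strong source.

    Every caller (password reset, sessions, CSRF, ...) would go through
    this one function instead of rolling its own generator.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not alphabet:
        raise ValueError("alphabet must not be empty")
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Callers that need a different shape, say a short numeric code, would pass their own `length` and `alphabet` rather than borrowing a helper from an unrelated app.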
> I had a discussion with Yarko Tymciurak regarding this. The work is
> nearly ready for a merge, with only some tasks left. I can work on
> these to ensure that the significant work already done gets into
> django and is updated for 1.5.
>
>  - Porting more stuff to the new system (README.sec in
>   [yarko's fork][yarko-fork])
>  - Testing - See if the current coverage of the tests is enough, write
>   them if not.
>  - Compatibility issues
>  - API Documentation
>
> I will study the changes done at djangocon and then attempt the tasks
> mentioned above.
>
>
> ##Integrating django-secure
> A really useful app for catching security configuration related
> mistakes is [carljm's django-secure][django-secure]. It is especially
> useful for finding issues that might have been introduced while
> making quick changes to settings for development. This project is
> popular and useful enough that it can be shipped with django. I
> haven't been able to give this enough time yet. I can think of two
> ways of integrating this:
>
>  - Dropping it in as a contrib app:
>   This seems pretty straightforward and would require a minimal
>   amount of changes.
>
>  - Distribute around the framework:
>   Like CSRF, this can also be distributed framework wide and hence it
>   won't be optional to have. Apps can still define custom checks in
>   the same way when `django-secure` was installed as a pluggable
>   application.
>
> The app might also need some changes whilst being integrated:
>
>  - More security checks, if required
>  - Adjust according to the changes introduced above.
>
> #Plan
> I think that the tasks CSRF enhancements and centralized tokenization
> will be enough to span through the SoC period. If after a thorough
> implementation and testing of these, I still have time, django-secure
> integration can be looked into.
>
>
> Roughly this proposal can span over a maximum of 5 tasks. Each task
> will generally have the following steps:
>
>  a. Initial Research. Design decisions
>  b. Implementation with minor parallel tests.
>  c. Thorough and regression testing to achieve security quality.
>  d. Configuration/Settings changes and handle compatibility issues.
>  e. Documentation.
>
> Tasks (with most effort requiring steps in parenthesis):
>
>  1. Origin Checking (b, c)
>  2. Multiple Allowed Domains (b, c)
>  3. Less restrictive CSRF checking over HTTPS / CORS for HTTPS (a, b)
>  4. Unified Tokenization (a,c,e)
>  5. Integration of django-secure (d,e)
>
> I'll be using [my fork of django][gh-fork] over github. I'll probably
> use the following branch names:
> csrf-enhancements (origin checking, multiple request domains etc)
> centralized-tokenization (djangocon2011-sec)
>
> ##Timeline
> Week 1: Task 1.a, 1.b.
> Week 2: Task 1.c, 1.d
> Week 3: Task 2.a, 2.b. Start task 3.a
> Week 4: Task 2.c, 2.d
> Week 5: Task 1.e, 2.e (Doing these together might be beneficial)
> Week 6-7: Complete 3.a. Task 3.b
> Week 7-9: Task 3.c, 3.d
> Week 10: Task 3.e
> Week 11-12: Tasks 4.abcde (max possible)
> Week 13: Complete Task 4 and maybe Max of Task 5
>
> *I am sorry for writing these as if written by a bot; the deadline was
> so close that I had to adopt this method*.
>
> [yarko-fork]: https://github.com/yarko/django
> [w3c-cors-draft]: http://www.w3.org/TR/access-control/
> [orig-check-ticket]: https://code.djangoproject.com/ticket/16859
> [tangled-web]: http://www.amazon.com/The-Tangled-Web-Securing-Applications/dp/1593273886/
> [gobrowsersec]: http://code.google.com/p/browsersec/wiki/Main
> [django-secure]: https://github.com/carljm/django-secure
> [gh-fork]: https://github.com/crodjer/django
