#14652: Sessions seem to be improperly using Pickle to hash a dictionary
----------------------------------------------+-----------------------------
          Reporter:  PaulM                    |         Owner:  nobody
            Status:  closed                   |     Milestone:  1.3   
         Component:  django.contrib.sessions  |       Version:  1.2   
        Resolution:  invalid                  |      Keywords:        
             Stage:  Unreviewed               |     Has_patch:  0     
        Needs_docs:  0                        |   Needs_tests:  0     
Needs_better_patch:  0                        |  
----------------------------------------------+-----------------------------
Comment (by lukeplant):

 Replying to [comment:2 PaulM]:
 > A differently ordered but valid pickle of the same data will still
 > produce a different MAC

 True

 > and so fail our check.

 False, because we don't compare the pickle of the new data to the pickle
 of the old data (or the hashes of those pickles).

 It does not matter that pickle doesn't necessarily produce the same
 results twice.  As far as the MAC step is concerned, the pickled string is
 simply an opaque string that needs signing.  The origin of that string,
 and the process used to generate it, are completely irrelevant as far
 as HMAC is concerned. We are storing an opaque string and the HMAC of that
 string, and neither can 'change' by itself; or rather, the point is that if
 either changes (by someone tampering with the data) we will know about it.
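
 As a minimal sketch of that idea (not the code Django uses): the MAC is
 computed over whatever bytes are stored, with no knowledge of how they were
 produced. `SECRET_KEY`, `sign` and `is_untampered` are hypothetical names
 chosen for illustration, and HMAC-SHA256 is an assumed choice of algorithm.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real deployment would use a secret setting.
SECRET_KEY = b"not-a-real-secret"


def sign(payload: bytes) -> str:
    """Return the hex HMAC-SHA256 of an opaque byte string."""
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def is_untampered(payload: bytes, mac: str) -> bool:
    """Recompute the MAC over the stored bytes and compare in constant time."""
    return hmac.compare_digest(sign(payload), mac)


# The payload could be the output of pickle.dumps, or anything else.
stored = b"any opaque bytes"
stored_mac = sign(stored)

unchanged_ok = is_untampered(stored, stored_mac)        # payload as stored
modified_ok = is_untampered(stored + b"x", stored_mac)  # payload altered
```

 Here `unchanged_ok` is true and `modified_ok` is false: only the stored
 bytes and the stored MAC are ever compared.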

 Once we've checked the integrity of our opaque string, we then go on to
 unpickle it. At this point, the '''only''' requirement is that
 `pickle.loads` correctly loads the data.  The `pickle.dumps()` of the
 restored data might be completely different from the stored string, but
 that makes no difference.  We would have a problem if we had code that did
 this:

 {{{
 #!python
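 # Hypothetical (wrong) approach: re-pickle the restored data and check its MAC.
 restored_data = pickle.loads(saved_data)
 # Breaks whenever pickle.dumps() yields a different string than the one saved.
 if MAC(pickle.dumps(restored_data)) != saved_hash:
     return {}
 else:
     return restored_data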
 }}}

 But we don't. Instead we do this:

 {{{
 #!python
 hash, pickled = encoded_data.split(':', 1)
 expected_hash = self._hash(pickled)
 if not constant_time_compare(hash, expected_hash):
     raise SuspiciousOperation("Session data corrupted")
 else:
     return pickle.loads(pickled)
 }}}


 Our process requires the following properties:

  1. `pickle.dumps` can be used to serialise our values to a string
  2. HMAC can be used to verify the integrity and authenticity of a string
  3. `pickle.loads` can be used to restore the string produced by 1. to the
     same data originally stored (where 'same' means Python equality, `==`,
     not identity, `is`).
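
 The three properties can be sketched end to end as follows. This is an
 illustrative round trip under stated assumptions, not the session backend
 itself: `encode`, `decode` and `SECRET_KEY` are hypothetical names,
 HMAC-SHA256 is an assumed algorithm, and `ValueError` stands in for
 `SuspiciousOperation` so the example has no Django dependency.

```python
import hashlib
import hmac
import pickle

# Hypothetical key for illustration only.
SECRET_KEY = b"not-a-real-secret"


def _mac(payload: bytes) -> bytes:
    """Hex HMAC-SHA256 of the payload, as ASCII bytes."""
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode("ascii")


def encode(session_dict: dict) -> bytes:
    """Property 1: serialise with pickle.dumps, then sign the resulting bytes."""
    pickled = pickle.dumps(session_dict)
    return _mac(pickled) + b":" + pickled


def decode(encoded: bytes) -> dict:
    """Property 2: verify the stored bytes. Property 3: only then unpickle."""
    mac, pickled = encoded.split(b":", 1)  # a hex MAC never contains ":"
    if not hmac.compare_digest(mac, _mac(pickled)):
        raise ValueError("Session data corrupted")
    return pickle.loads(pickled)


original = {"user_id": 42, "cart": [1, 2, 3]}
encoded = encode(original)
restored = decode(encoded)

# Flip one bit in the last byte to simulate tampering with the stored string.
tampered = encoded[:-1] + bytes([encoded[-1] ^ 1])
```

 `restored == original` holds, while `decode(tampered)` raises. At no point
 is `pickle.dumps` called again on the restored data.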

 All of these are satisfied, and further properties of pickle are not
 relevant. Tim's e-mail is about a completely different process, in which
 someone was effectively trying to pickle two values and compare the
 pickled strings in order to compare the values.  This fails because
 `pickle.dumps` is multi-valued in its output, i.e. the same input can
 produce different output.
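
 One way to see this multi-valued behaviour, assuming Python 3.7 or later
 (where dictionaries keep insertion order): two dictionaries that compare
 equal can pickle to different byte strings, yet both unpickle to equal
 values. The variable names are for illustration only.

```python
import pickle

first = {"a": 1, "b": 2}
second = {"b": 2, "a": 1}  # same items, inserted in a different order

# The values themselves are equal.
values_equal = first == second

# The pickled byte strings differ, because items are written in insertion order.
same_bytes = pickle.dumps(first) == pickle.dumps(second)

# Each pickle still restores to data equal to what was stored.
round_trip_equal = pickle.loads(pickle.dumps(first)) == pickle.loads(pickle.dumps(second))
```

 Here `values_equal` and `round_trip_equal` are true while `same_bytes` is
 false, so comparing pickled strings is not a reliable way to compare values.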

-- 
Ticket URL: <http://code.djangoproject.com/ticket/14652#comment:3>
Django <http://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
