Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-19 Thread Stephen J. Turnbull
Antoine Pitrou writes:

  That's unsubstantiated.

Sure.  If I had a CVE, I would have posted it.

  Give an example of how sorted URLs compromise security.

That's not how you think about security; the right question about
sorted URLs is how do you know that they *don't* compromise
security?  We know that mishandling URLs *can* compromise security
(eg, via bugs in directory traversal).

But you know that.  What you presumably mean here is why do you think
randomly changing query parameter order in URLs is more secure than
sorted order?  The answer to that is that since the server can't
depend on order, it *must* handle more configurations of parameters by
design (and presumably in implementation and testing), and therefore
will be robust against more kinds of parameter configurations.  Eg,
there will be no temptation to optimize processing by handling
parameters in sorted order.

Is this a real danger?  Maybe not.  But every unnecessary regularity
in inputs that a program's implementation depends on is a potential
attack vector via irregular inputs.

Remember, I was responding to a claim that sorted order is *always*
better.  That's a dangerous kind of claim to make about anything that
could be input to an Internet server.

Steve
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-19 Thread Stephen J. Turnbull
Joao S. O. Bueno writes:

  Ageeded that any way one thinks about it is far too strong a claim -
  but I still hold to the point. Maybe most ways one thinks about it
  :-)  .

100% agreement now.wink/

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-19 Thread Antoine Pitrou
On Sun, 19 Aug 2012 20:55:31 +0900
Stephen J. Turnbull step...@xemacs.org wrote:
 Antoine Pitrou writes:
 
   That's unsubstantiated.
 
 Sure.  If I had a CVE, I would have posted it.

Ok, so you have no evidence.

Regards

Antoine.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-19 Thread Senthil Kumaran
On Sat, Aug 18, 2012 at 1:55 PM, Glenn Linderman v+pyt...@g.nevcal.com wrote:

 On 8/18/2012 11:47 AM, MRAB wrote:

 I vote -0. The issue can also be addressed with a small and simple
 helper function that wraps urlparse and compares the query parameter. Or
 you cann urlencode() with `sorted(qs.items)` instead of `qs` in the
 application.


 Hm. That's actually a good point.


 Seems adequate to me. Most programs wouldn't care about the order, because 
 most web frameworks grab whatever is there in whatever order, and present it 
 to the web app in their own order.

 Programs that care, or which talk to web apps that care, are unlikely to want 
 the order from a non-randomized dict, and so have already taken care of 
 ordering issues, so undoing the randomization seems like a solution in search 
 of a problem (other than for poorly written test cases).



I am of the same thought too. Changing a behavior based on the test
case expectation, no matter if the behavior is a harmless change is
still a change. Coming to the point testing query string could be
useful in some cases and then giving weightage to the change seems
interesting use case, but does not seem to warrant a change. I think,
I like Christian Heimes suggestion that a wrapper to compare query
strings would be useful and in Guido's original test case, a tittle
test code change would have been good.

Looks like Guido has withdrawn the bug report too.

-- 
Senthil
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-18 Thread Peter Otten
Guido van Rossum wrote:

 I wonder if it wouldn't make sense to change urlencode() to generate
 URLs that don't depend on the hash order, for all versions of Python
 that support PYTHONHASHSEED? It seems a one-line fix:
 
 query = query.items()
 
 with this:
 
 query = sorted(query.items())
 
 This would not prevent breakage of unit tests, but it would make a
 much simpler fix possible: simply sort the parameters in the URL.
 
 Thoughts?

There may be people who mix bytes and str or pass other non-str keys:

 query = {ba:bb, c:d, 5:6}
 urlencode(query)
'a=bc=d5=6'
 sorted(query.items())
Traceback (most recent call last):
  File stdin, line 1, in module
TypeError: unorderable types: str()  bytes()

Not pretty, but a bugfix should not break such constructs.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-18 Thread Antoine Pitrou
On Sat, 18 Aug 2012 14:23:13 +0900
Stephen J. Turnbull step...@xemacs.org wrote:
 Joao S. O. Bueno writes:
 
   I don't think this behavior is only desirable to unit tests: having
   URL's been formed in predictable way  a good thing in any way one
   thinks about it.
 
 Especially if you're a hacker.  One more thing you may be able to use
 against careless sites that don't expect the unexpected to occur in
 URLs.

That's unsubstantiated. Give an example of how sorted URLs compromise
security.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-18 Thread Joao S. O. Bueno
On 18 August 2012 02:23, Stephen J. Turnbull step...@xemacs.org wrote:
 Joao S. O. Bueno writes:

   I don't think this behavior is only desirable to unit tests: having
   URL's been formed in predictable way  a good thing in any way one
   thinks about it.

 Especially if you're a hacker.  One more thing you may be able to use
 against careless sites that don't expect the unexpected to occur in
 URLs.

 I'm not saying this is a bad thing, but we should remember that the
 whole point of PYTHONHASHSEED is that regularities can be exploited
 for devious and malicious purposes, and reducing regularity makes many
 attacks more difficult.  *Any* way one thinks about it is far too
 strong a claim.

Ageeded that any way one thinks about it is far too strong a claim -
but I still hold to the point. Maybe most ways one thinks about it
:-)  .



 Steve




___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-18 Thread Christian Heimes
Am 17.08.2012 21:27, schrieb Guido van Rossum:
 I wonder if it wouldn't make sense to change urlencode() to generate
 URLs that don't depend on the hash order, for all versions of Python
 that support PYTHONHASHSEED? It seems a one-line fix:
 
 query = query.items()
 
 with this:
 
 query = sorted(query.items())
 
 This would not prevent breakage of unit tests, but it would make a
 much simpler fix possible: simply sort the parameters in the URL.

I vote -0. The issue can also be addressed with a small and simple
helper function that wraps urlparse and compares the query parameter. Or
you cann urlencode() with `sorted(qs.items)` instead of `qs` in the
application.

The order of query string parameter is actually important for some
applications, for example Zope, colander+deform and other form
frameworks use the parameter order to group parameters.

Therefore I propose that the query string is only sorted when the query
is exactly a dict and not some subclass or class that has an items() method.

if type(query) is dict:
query = sorted(query.items())
else:
query = query.items()

Christian

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-18 Thread Guido van Rossum
On Sat, Aug 18, 2012 at 6:28 AM, Christian Heimes li...@cheimes.de wrote:
 Am 17.08.2012 21:27, schrieb Guido van Rossum:
 I wonder if it wouldn't make sense to change urlencode() to generate
 URLs that don't depend on the hash order, for all versions of Python
 that support PYTHONHASHSEED? It seems a one-line fix:

 query = query.items()

 with this:

 query = sorted(query.items())

 This would not prevent breakage of unit tests, but it would make a
 much simpler fix possible: simply sort the parameters in the URL.

 I vote -0. The issue can also be addressed with a small and simple
 helper function that wraps urlparse and compares the query parameter. Or
 you cann urlencode() with `sorted(qs.items)` instead of `qs` in the
 application.

Hm. That's actually a good point.

 The order of query string parameter is actually important for some
 applications, for example Zope, colander+deform and other form
 frameworks use the parameter order to group parameters.

 Therefore I propose that the query string is only sorted when the query
 is exactly a dict and not some subclass or class that has an items() method.

 if type(query) is dict:
 query = sorted(query.items())
 else:
 query = query.items()

That's already in the bug I filed. :-) I also added that the sort may
fail if the keys mix e.g. bytes and str (or int and str, for that
matter).

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-18 Thread MRAB

On 18/08/2012 18:34, Guido van Rossum wrote:

On Sat, Aug 18, 2012 at 6:28 AM, Christian Heimes li...@cheimes.de wrote:

Am 17.08.2012 21:27, schrieb Guido van Rossum:

I wonder if it wouldn't make sense to change urlencode() to generate
URLs that don't depend on the hash order, for all versions of Python
that support PYTHONHASHSEED? It seems a one-line fix:

query = query.items()

with this:

query = sorted(query.items())

This would not prevent breakage of unit tests, but it would make a
much simpler fix possible: simply sort the parameters in the URL.


I vote -0. The issue can also be addressed with a small and simple
helper function that wraps urlparse and compares the query parameter. Or
you cann urlencode() with `sorted(qs.items)` instead of `qs` in the
application.


Hm. That's actually a good point.


The order of query string parameter is actually important for some
applications, for example Zope, colander+deform and other form
frameworks use the parameter order to group parameters.

Therefore I propose that the query string is only sorted when the query
is exactly a dict and not some subclass or class that has an items() method.

if type(query) is dict:
query = sorted(query.items())
else:
query = query.items()


That's already in the bug I filed. :-) I also added that the sort may
fail if the keys mix e.g. bytes and str (or int and str, for that
matter).


One possible way around that is to add the class names, perhaps only if
sorting raises an exception:

def make_key(pair):
return type(pair[0]).__name__, type(pair[1]).__name__, pair

if type(query) is dict:
try:
query = sorted(query.items())
except TypeError:
query = sorted(query.items(), key=make_key)
else:
query = query.items()

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-18 Thread Guido van Rossum
On Saturday, August 18, 2012, MRAB wrote:

 On 18/08/2012 18:34, Guido van Rossum wrote:

 On Sat, Aug 18, 2012 at 6:28 AM, Christian Heimes li...@cheimes.de
 wrote:

 Am 17.08.2012 21:27, schrieb Guido van Rossum:

 I wonder if it wouldn't make sense to change urlencode() to generate
 URLs that don't depend on the hash order, for all versions of Python
 that support PYTHONHASHSEED? It seems a one-line fix:

 query = query.items()

 with this:

 query = sorted(query.items())

 This would not prevent breakage of unit tests, but it would make a
 much simpler fix possible: simply sort the parameters in the URL.


 I vote -0. The issue can also be addressed with a small and simple
 helper function that wraps urlparse and compares the query parameter. Or
 you cann urlencode() with `sorted(qs.items)` instead of `qs` in the
 application.


 Hm. That's actually a good point.

  The order of query string parameter is actually important for some
 applications, for example Zope, colander+deform and other form
 frameworks use the parameter order to group parameters.

 Therefore I propose that the query string is only sorted when the query
 is exactly a dict and not some subclass or class that has an items()
 method.

 if type(query) is dict:
 query = sorted(query.items())
 else:
 query = query.items()


 That's already in the bug I filed. :-) I also added that the sort may
 fail if the keys mix e.g. bytes and str (or int and str, for that
 matter).

  One possible way around that is to add the class names, perhaps only if
 sorting raises an exception:

 def make_key(pair):
 return type(pair[0]).__name__, type(pair[1]).__name__, pair

 if type(query) is dict:
 try:
 query = sorted(query.items())
 except TypeError:
 query = sorted(query.items(), key=make_key)
 else:
 query = query.items()


Doesn't strike me as necessary.


 __**_
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/**mailman/listinfo/python-devhttp://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: http://mail.python.org/**mailman/options/python-dev/**
 guido%40python.orghttp://mail.python.org/mailman/options/python-dev/guido%40python.org



-- 
Sent from Gmail Mobile
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-18 Thread Glenn Linderman

On 8/18/2012 11:47 AM, MRAB wrote:

I vote -0. The issue can also be addressed with a small and simple
helper function that wraps urlparse and compares the query parameter. Or
you cann urlencode() with `sorted(qs.items)` instead of `qs` in the
application.


Hm. That's actually a good point. 


Seems adequate to me. Most programs wouldn't care about the order, 
because most web frameworks grab whatever is there in whatever order, 
and present it to the web app in their own order.


Programs that care, or which talk to web apps that care, are unlikely to 
want the order from a non-randomized dict, and so have already taken 
care of ordering issues, so undoing the randomization seems like a 
solution in search of a problem (other than for poorly written test cases).
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-17 Thread Guido van Rossum
I just fixed a unittest for some code used at Google that was
comparing a url generated by urllib.encode() to a fixed string. The
problem was caused by turning on PYTHONHASHSEED=1. Because of this,
the code under test would generate a textually different URL each time
the test was run, but the intention of the test was just to check that
all the query parameters were present and equal to the expected
values.

The solution was somewhat painful, I had to parse the url, split the
query parameters, and compare them to a known dict.

I wonder if it wouldn't make sense to change urlencode() to generate
URLs that don't depend on the hash order, for all versions of Python
that support PYTHONHASHSEED? It seems a one-line fix:

query = query.items()

with this:

query = sorted(query.items())

This would not prevent breakage of unit tests, but it would make a
much simpler fix possible: simply sort the parameters in the URL.

Thoughts?

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-17 Thread Daniel Holth
Only if it is not an OrderedDict

 query = sorted(query.items())

 This would not prevent breakage of unit tests, but it would make a
 much simpler fix possible: simply sort the parameters in the URL.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-17 Thread Martin v. Löwis
On 17.08.2012 21:27, Guido van Rossum wrote:
 query = sorted(query.items())

 This would not prevent breakage of unit tests, but it would make a
 much simpler fix possible: simply sort the parameters in the URL.

 Thoughts?

Sounds good. For best backwards compatibility, I'd restrict the sorting
to the exact dict type, since people may be using non-dict mappings
which already have a different stable order.

 for all versions of Python that support PYTHONHASHSEED?

I think this cannot be done, in particular not for 2.6 and 3.1 - it's
not a security fix (*).

Strictly speaking, it isn't even a bug fix, since it doesn't restore
the original behavior that some people (like your test case) relied
on. In particular, if somebody has fixed PYTHONHASHSEED to get a stable
order, this change would break such installations. By that policy, it
could only go into 3.4.

OTOH, if it also checked whether there is randomized hashing, and sort
only in that case, I think it should be backwards compatible in all
interesting cases.

Regards,
Martin

(*) I guess some may claim that the current implementation leaks
some bits of the hash seed, since you can learn the seed from
the parameter order, so sorting would make it more secure. However,
I would disagree that this constitutes a feasible threat.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-17 Thread Guido van Rossum
Thanks, I filed http://bugs.python.org/issue15719 to track this.

On Fri, Aug 17, 2012 at 12:50 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 On 17.08.2012 21:27, Guido van Rossum wrote:
 query = sorted(query.items())

 This would not prevent breakage of unit tests, but it would make a
 much simpler fix possible: simply sort the parameters in the URL.

 Thoughts?

 Sounds good. For best backwards compatibility, I'd restrict the sorting
 to the exact dict type, since people may be using non-dict mappings
 which already have a different stable order.

 for all versions of Python that support PYTHONHASHSEED?

 I think this cannot be done, in particular not for 2.6 and 3.1 - it's
 not a security fix (*).

 Strictly speaking, it isn't even a bug fix, since it doesn't restore
 the original behavior that some people (like your test case) relied
 on. In particular, if somebody has fixed PYTHONHASHSEED to get a stable
 order, this change would break such installations. By that policy, it
 could only go into 3.4.

 OTOH, if it also checked whether there is randomized hashing, and sort
 only in that case, I think it should be backwards compatible in all
 interesting cases.

 Regards,
 Martin

 (*) I guess some may claim that the current implementation leaks
 some bits of the hash seed, since you can learn the seed from
 the parameter order, so sorting would make it more secure. However,
 I would disagree that this constitutes a feasible threat.



-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-17 Thread Chris Angelico
On Sat, Aug 18, 2012 at 5:27 AM, Guido van Rossum gu...@python.org wrote:
 I just fixed a unittest for some code used at Google that was
 comparing a url generated by urllib.encode() to a fixed string. The
 problem was caused by turning on PYTHONHASHSEED=1. Because of this,
 the code under test would generate a textually different URL each time
 the test was run, but the intention of the test was just to check that
 all the query parameters were present and equal to the expected
 values.


 query = sorted(query.items())

Hmm. ISTM this is putting priority on the unit test above the
functionality of actual usage. Although on the other hand, sorting
parameters on a URL is nothing compared to the cost of network
traffic, so it's unlikely to be significant.

Chris Angelico
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-17 Thread Joao S. O. Bueno
On 17 August 2012 18:52, Chris Angelico ros...@gmail.com wrote:
 On Sat, Aug 18, 2012 at 5:27 AM, Guido van Rossum gu...@python.org wrote:
 I just fixed a unittest for some code used at Google that was
 comparing a url generated by urllib.encode() to a fixed string. The
 problem was caused by turning on PYTHONHASHSEED=1. Because of this,
 the code under test would generate a textually different URL each time
 the test was run, but the intention of the test was just to check that
 all the query parameters were present and equal to the expected
 values.


 query = sorted(query.items())

 Hmm. ISTM this is putting priority on the unit test above the
 functionality of actual usage. Although on the other hand, sorting
 parameters on a URL is nothing compared to the cost of network
 traffic, so it's unlikely to be significant.


I don't think this behavior is only desirable to unit tests: having
URL's been formed in predictable way  a good thing in any way one
thinks about it.

   js
  --

 Chris Angelico

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?

2012-08-17 Thread Stephen J. Turnbull
Joao S. O. Bueno writes:

  I don't think this behavior is only desirable to unit tests: having
  URL's been formed in predictable way  a good thing in any way one
  thinks about it.

Especially if you're a hacker.  One more thing you may be able to use
against careless sites that don't expect the unexpected to occur in
URLs.

I'm not saying this is a bad thing, but we should remember that the
whole point of PYTHONHASHSEED is that regularities can be exploited
for devious and malicious purposes, and reducing regularity makes many
attacks more difficult.  *Any* way one thinks about it is far too
strong a claim.

Steve




___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com