Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
Antoine Pitrou writes: That's unsubstantiated. Sure. If I had a CVE, I would have posted it. Give an example of how sorted URLs compromise security. That's not how you think about security; the right question about sorted URLs is how do you know that they *don't* compromise security? We know that mishandling URLs *can* compromise security (eg, via bugs in directory traversal). But you know that. What you presumably mean here is why do you think randomly changing query parameter order in URLs is more secure than sorted order? The answer to that is that since the server can't depend on order, it *must* handle more configurations of parameters by design (and presumably in implementation and testing), and therefore will be robust against more kinds of parameter configurations. Eg, there will be no temptation to optimize processing by handling parameters in sorted order. Is this a real danger? Maybe not. But every unnecessary regularity in inputs that a program's implementation depends on is a potential attack vector via irregular inputs. Remember, I was responding to a claim that sorted order is *always* better. That's a dangerous kind of claim to make about anything that could be input to an Internet server. Steve ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
Joao S. O. Bueno writes: Ageeded that any way one thinks about it is far too strong a claim - but I still hold to the point. Maybe most ways one thinks about it :-) . 100% agreement now.wink/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On Sun, 19 Aug 2012 20:55:31 +0900 Stephen J. Turnbull step...@xemacs.org wrote: Antoine Pitrou writes: That's unsubstantiated. Sure. If I had a CVE, I would have posted it. Ok, so you have no evidence. Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On Sat, Aug 18, 2012 at 1:55 PM, Glenn Linderman v+pyt...@g.nevcal.com wrote: On 8/18/2012 11:47 AM, MRAB wrote: I vote -0. The issue can also be addressed with a small and simple helper function that wraps urlparse and compares the query parameter. Or you cann urlencode() with `sorted(qs.items)` instead of `qs` in the application. Hm. That's actually a good point. Seems adequate to me. Most programs wouldn't care about the order, because most web frameworks grab whatever is there in whatever order, and present it to the web app in their own order. Programs that care, or which talk to web apps that care, are unlikely to want the order from a non-randomized dict, and so have already taken care of ordering issues, so undoing the randomization seems like a solution in search of a problem (other than for poorly written test cases). I am of the same thought too. Changing a behavior based on the test case expectation, no matter if the behavior is a harmless change is still a change. Coming to the point testing query string could be useful in some cases and then giving weightage to the change seems interesting use case, but does not seem to warrant a change. I think, I like Christian Heimes suggestion that a wrapper to compare query strings would be useful and in Guido's original test case, a tittle test code change would have been good. Looks like Guido has withdrawn the bug report too. -- Senthil ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
Guido van Rossum wrote: I wonder if it wouldn't make sense to change urlencode() to generate URLs that don't depend on the hash order, for all versions of Python that support PYTHONHASHSEED? It seems a one-line fix: query = query.items() with this: query = sorted(query.items()) This would not prevent breakage of unit tests, but it would make a much simpler fix possible: simply sort the parameters in the URL. Thoughts? There may be people who mix bytes and str or pass other non-str keys: query = {ba:bb, c:d, 5:6} urlencode(query) 'a=bc=d5=6' sorted(query.items()) Traceback (most recent call last): File stdin, line 1, in module TypeError: unorderable types: str() bytes() Not pretty, but a bugfix should not break such constructs. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On Sat, 18 Aug 2012 14:23:13 +0900 Stephen J. Turnbull step...@xemacs.org wrote: Joao S. O. Bueno writes: I don't think this behavior is only desirable to unit tests: having URL's been formed in predictable way a good thing in any way one thinks about it. Especially if you're a hacker. One more thing you may be able to use against careless sites that don't expect the unexpected to occur in URLs. That's unsubstantiated. Give an example of how sorted URLs compromise security. Regards Antoine. -- Software development and contracting: http://pro.pitrou.net ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On 18 August 2012 02:23, Stephen J. Turnbull step...@xemacs.org wrote: Joao S. O. Bueno writes: I don't think this behavior is only desirable to unit tests: having URL's been formed in predictable way a good thing in any way one thinks about it. Especially if you're a hacker. One more thing you may be able to use against careless sites that don't expect the unexpected to occur in URLs. I'm not saying this is a bad thing, but we should remember that the whole point of PYTHONHASHSEED is that regularities can be exploited for devious and malicious purposes, and reducing regularity makes many attacks more difficult. *Any* way one thinks about it is far too strong a claim. Ageeded that any way one thinks about it is far too strong a claim - but I still hold to the point. Maybe most ways one thinks about it :-) . Steve ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
Am 17.08.2012 21:27, schrieb Guido van Rossum: I wonder if it wouldn't make sense to change urlencode() to generate URLs that don't depend on the hash order, for all versions of Python that support PYTHONHASHSEED? It seems a one-line fix: query = query.items() with this: query = sorted(query.items()) This would not prevent breakage of unit tests, but it would make a much simpler fix possible: simply sort the parameters in the URL. I vote -0. The issue can also be addressed with a small and simple helper function that wraps urlparse and compares the query parameter. Or you cann urlencode() with `sorted(qs.items)` instead of `qs` in the application. The order of query string parameter is actually important for some applications, for example Zope, colander+deform and other form frameworks use the parameter order to group parameters. Therefore I propose that the query string is only sorted when the query is exactly a dict and not some subclass or class that has an items() method. if type(query) is dict: query = sorted(query.items()) else: query = query.items() Christian ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On Sat, Aug 18, 2012 at 6:28 AM, Christian Heimes li...@cheimes.de wrote: Am 17.08.2012 21:27, schrieb Guido van Rossum: I wonder if it wouldn't make sense to change urlencode() to generate URLs that don't depend on the hash order, for all versions of Python that support PYTHONHASHSEED? It seems a one-line fix: query = query.items() with this: query = sorted(query.items()) This would not prevent breakage of unit tests, but it would make a much simpler fix possible: simply sort the parameters in the URL. I vote -0. The issue can also be addressed with a small and simple helper function that wraps urlparse and compares the query parameter. Or you cann urlencode() with `sorted(qs.items)` instead of `qs` in the application. Hm. That's actually a good point. The order of query string parameter is actually important for some applications, for example Zope, colander+deform and other form frameworks use the parameter order to group parameters. Therefore I propose that the query string is only sorted when the query is exactly a dict and not some subclass or class that has an items() method. if type(query) is dict: query = sorted(query.items()) else: query = query.items() That's already in the bug I filed. :-) I also added that the sort may fail if the keys mix e.g. bytes and str (or int and str, for that matter). -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On 18/08/2012 18:34, Guido van Rossum wrote: On Sat, Aug 18, 2012 at 6:28 AM, Christian Heimes li...@cheimes.de wrote: Am 17.08.2012 21:27, schrieb Guido van Rossum: I wonder if it wouldn't make sense to change urlencode() to generate URLs that don't depend on the hash order, for all versions of Python that support PYTHONHASHSEED? It seems a one-line fix: query = query.items() with this: query = sorted(query.items()) This would not prevent breakage of unit tests, but it would make a much simpler fix possible: simply sort the parameters in the URL. I vote -0. The issue can also be addressed with a small and simple helper function that wraps urlparse and compares the query parameter. Or you cann urlencode() with `sorted(qs.items)` instead of `qs` in the application. Hm. That's actually a good point. The order of query string parameter is actually important for some applications, for example Zope, colander+deform and other form frameworks use the parameter order to group parameters. Therefore I propose that the query string is only sorted when the query is exactly a dict and not some subclass or class that has an items() method. if type(query) is dict: query = sorted(query.items()) else: query = query.items() That's already in the bug I filed. :-) I also added that the sort may fail if the keys mix e.g. bytes and str (or int and str, for that matter). One possible way around that is to add the class names, perhaps only if sorting raises an exception: def make_key(pair): return type(pair[0]).__name__, type(pair[1]).__name__, pair if type(query) is dict: try: query = sorted(query.items()) except TypeError: query = sorted(query.items(), key=make_key) else: query = query.items() ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On Saturday, August 18, 2012, MRAB wrote: On 18/08/2012 18:34, Guido van Rossum wrote: On Sat, Aug 18, 2012 at 6:28 AM, Christian Heimes li...@cheimes.de wrote: Am 17.08.2012 21:27, schrieb Guido van Rossum: I wonder if it wouldn't make sense to change urlencode() to generate URLs that don't depend on the hash order, for all versions of Python that support PYTHONHASHSEED? It seems a one-line fix: query = query.items() with this: query = sorted(query.items()) This would not prevent breakage of unit tests, but it would make a much simpler fix possible: simply sort the parameters in the URL. I vote -0. The issue can also be addressed with a small and simple helper function that wraps urlparse and compares the query parameter. Or you cann urlencode() with `sorted(qs.items)` instead of `qs` in the application. Hm. That's actually a good point. The order of query string parameter is actually important for some applications, for example Zope, colander+deform and other form frameworks use the parameter order to group parameters. Therefore I propose that the query string is only sorted when the query is exactly a dict and not some subclass or class that has an items() method. if type(query) is dict: query = sorted(query.items()) else: query = query.items() That's already in the bug I filed. :-) I also added that the sort may fail if the keys mix e.g. bytes and str (or int and str, for that matter). One possible way around that is to add the class names, perhaps only if sorting raises an exception: def make_key(pair): return type(pair[0]).__name__, type(pair[1]).__name__, pair if type(query) is dict: try: query = sorted(query.items()) except TypeError: query = sorted(query.items(), key=make_key) else: query = query.items() Doesn't strike me as necessary. __**_ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/**mailman/listinfo/python-devhttp://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/**mailman/options/python-dev/** guido%40python.orghttp://mail.python.org/mailman/options/python-dev/guido%40python.org -- Sent from Gmail Mobile ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On 8/18/2012 11:47 AM, MRAB wrote: I vote -0. The issue can also be addressed with a small and simple helper function that wraps urlparse and compares the query parameter. Or you cann urlencode() with `sorted(qs.items)` instead of `qs` in the application. Hm. That's actually a good point. Seems adequate to me. Most programs wouldn't care about the order, because most web frameworks grab whatever is there in whatever order, and present it to the web app in their own order. Programs that care, or which talk to web apps that care, are unlikely to want the order from a non-randomized dict, and so have already taken care of ordering issues, so undoing the randomization seems like a solution in search of a problem (other than for poorly written test cases). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
I just fixed a unittest for some code used at Google that was comparing a url generated by urllib.encode() to a fixed string. The problem was caused by turning on PYTHONHASHSEED=1. Because of this, the code under test would generate a textually different URL each time the test was run, but the intention of the test was just to check that all the query parameters were present and equal to the expected values. The solution was somewhat painful, I had to parse the url, split the query parameters, and compare them to a known dict. I wonder if it wouldn't make sense to change urlencode() to generate URLs that don't depend on the hash order, for all versions of Python that support PYTHONHASHSEED? It seems a one-line fix: query = query.items() with this: query = sorted(query.items()) This would not prevent breakage of unit tests, but it would make a much simpler fix possible: simply sort the parameters in the URL. Thoughts? -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
Only if it is not an OrderedDict query = sorted(query.items()) This would not prevent breakage of unit tests, but it would make a much simpler fix possible: simply sort the parameters in the URL. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On 17.08.2012 21:27, Guido van Rossum wrote: query = sorted(query.items()) This would not prevent breakage of unit tests, but it would make a much simpler fix possible: simply sort the parameters in the URL. Thoughts? Sounds good. For best backwards compatibility, I'd restrict the sorting to the exact dict type, since people may be using non-dict mappings which already have a different stable order. for all versions of Python that support PYTHONHASHSEED? I think this cannot be done, in particular not for 2.6 and 3.1 - it's not a security fix (*). Strictly speaking, it isn't even a bug fix, since it doesn't restore the original behavior that some people (like your test case) relied on. In particular, if somebody has fixed PYTHONHASHSEED to get a stable order, this change would break such installations. By that policy, it could only go into 3.4. OTOH, if it also checked whether there is randomized hashing, and sort only in that case, I think it should be backwards compatible in all interesting cases. Regards, Martin (*) I guess some may claim that the current implementation leaks some bits of the hash seed, since you can learn the seed from the parameter order, so sorting would make it more secure. However, I would disagree that this constitutes a feasible threat. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
Thanks, I filed http://bugs.python.org/issue15719 to track this. On Fri, Aug 17, 2012 at 12:50 PM, Martin v. Löwis mar...@v.loewis.de wrote: On 17.08.2012 21:27, Guido van Rossum wrote: query = sorted(query.items()) This would not prevent breakage of unit tests, but it would make a much simpler fix possible: simply sort the parameters in the URL. Thoughts? Sounds good. For best backwards compatibility, I'd restrict the sorting to the exact dict type, since people may be using non-dict mappings which already have a different stable order. for all versions of Python that support PYTHONHASHSEED? I think this cannot be done, in particular not for 2.6 and 3.1 - it's not a security fix (*). Strictly speaking, it isn't even a bug fix, since it doesn't restore the original behavior that some people (like your test case) relied on. In particular, if somebody has fixed PYTHONHASHSEED to get a stable order, this change would break such installations. By that policy, it could only go into 3.4. OTOH, if it also checked whether there is randomized hashing, and sort only in that case, I think it should be backwards compatible in all interesting cases. Regards, Martin (*) I guess some may claim that the current implementation leaks some bits of the hash seed, since you can learn the seed from the parameter order, so sorting would make it more secure. However, I would disagree that this constitutes a feasible threat. -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On Sat, Aug 18, 2012 at 5:27 AM, Guido van Rossum gu...@python.org wrote: I just fixed a unittest for some code used at Google that was comparing a url generated by urllib.encode() to a fixed string. The problem was caused by turning on PYTHONHASHSEED=1. Because of this, the code under test would generate a textually different URL each time the test was run, but the intention of the test was just to check that all the query parameters were present and equal to the expected values. query = sorted(query.items()) Hmm. ISTM this is putting priority on the unit test above the functionality of actual usage. Although on the other hand, sorting parameters on a URL is nothing compared to the cost of network traffic, so it's unlikely to be significant. Chris Angelico ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
On 17 August 2012 18:52, Chris Angelico ros...@gmail.com wrote: On Sat, Aug 18, 2012 at 5:27 AM, Guido van Rossum gu...@python.org wrote: I just fixed a unittest for some code used at Google that was comparing a url generated by urllib.encode() to a fixed string. The problem was caused by turning on PYTHONHASHSEED=1. Because of this, the code under test would generate a textually different URL each time the test was run, but the intention of the test was just to check that all the query parameters were present and equal to the expected values. query = sorted(query.items()) Hmm. ISTM this is putting priority on the unit test above the functionality of actual usage. Although on the other hand, sorting parameters on a URL is nothing compared to the cost of network traffic, so it's unlikely to be significant. I don't think this behavior is only desirable to unit tests: having URL's been formed in predictable way a good thing in any way one thinks about it. js -- Chris Angelico ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Should urlencode() sort the query parameters (if they come from a dict)?
Joao S. O. Bueno writes: I don't think this behavior is only desirable to unit tests: having URL's been formed in predictable way a good thing in any way one thinks about it. Especially if you're a hacker. One more thing you may be able to use against careless sites that don't expect the unexpected to occur in URLs. I'm not saying this is a bad thing, but we should remember that the whole point of PYTHONHASHSEED is that regularities can be exploited for devious and malicious purposes, and reducing regularity makes many attacks more difficult. *Any* way one thinks about it is far too strong a claim. Steve ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com