[issue30713] Reject newline character (U+000A) in URLs in urllib.parse

Martin Panter Sun, 02 Jul 2017 04:44:19 -0700

Martin Panter added the comment:

It might help if you explained why you want to make these changes. Otherwise I 
have to guess. Is it a compromise between strictly rejecting all non-URL 
characters (not just control characters), versus leaving it up to user 
applications to validate their URLs?


I guess it could partially prevent some newline injection problems like Issue 
29606 (FTP) and Issue 30458 (HTTP). But how do we know it closes more security 
holes than it opens?
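
For illustration only, here is a minimal sketch of the kind of newline injection being discussed. It is not the code from Issue 30458; "build_request" and "has_control_chars" are hypothetical helpers, and the request builder is deliberately naive.

```python
def build_request(host, path):
    # Hypothetical, deliberately naive builder: the path is trusted verbatim.
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def has_control_chars(text):
    # True if the text contains any C0 control character or DEL.
    return any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in text)


# A path carrying CR LF lets the caller smuggle an extra header line.
malicious_path = "/index.html HTTP/1.1\r\nX-Injected: yes\r\nIgnored:"
request = build_request("example.com", malicious_path)
header_lines = request.split("\r\n")

# The injected header now appears as a line of its own in the request.
injected = "X-Injected: yes" in header_lines
```

Rejecting the path when has_control_chars() returns true would stop this particular case, but it says nothing about what other callers do with a URL that is rejected or only partially parsed.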

I don’t understand the focus on these three functions. They are undocumented 
and more-or-less deprecated (Issue 27485). Why not focus on the “urlsplit” and 
“urlparse” functions first?

Some of the changes seem to go too far, e.g. in the 
splithost("//hostname/u\nrl") test case, the hostname is fine, but it is not 
recognized. This would partially conflict with the patch in Issue 13359, which 
proposes to percent-encode newlines after passing through “splithost”. And it 
would make the URL look like a relative URL, which is a potential security hole 
and reminds me of the open redirect bug report (Issue 23505).
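
A rough sketch of the concern, using stand-in functions rather than the real “splithost” or the proposed patch. Both "lenient_splithost" and "strict_splithost" are hypothetical, written only to contrast the two behaviours; “quote” is the existing urllib.parse function.

```python
import re
from urllib.parse import quote

# Assumed pattern for this sketch: "//", then the host, then the rest.
_HOST_RE = re.compile(r"//([^/#?]*)(.*)", re.DOTALL)


def lenient_splithost(url):
    # Hypothetical: split off the host even if the rest contains a newline.
    match = _HOST_RE.match(url)
    if match:
        host, path = match.groups()
        if path and not path.startswith("/"):
            path = "/" + path
        return host, path
    return None, url


def strict_splithost(url):
    # Hypothetical: refuse to recognize a host if a newline appears anywhere.
    if "\n" in url:
        return None, url
    return lenient_splithost(url)


url = "//hostname/u\nrl"

# Lenient split keeps the host, so the path can be percent-encoded afterwards.
host, path = lenient_splithost(url)
encoded_path = quote(path, safe="/")

# Strict split reports no host, so the whole string looks like a relative URL.
strict_host, strict_rest = strict_splithost(url)
```

With the lenient version the caller still sees "hostname" and can encode the newline as %0A. With the strict version the host is None and the untouched string comes back as the path, which is the relative-URL lookalike described above.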

----------
nosy: +martin.panter

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue30713>
_______________________________________