On Jul 21, 2019, at 14:13, Barry <ba...@barrys-emacs.org> wrote:
> 
>>> On 21 Jul 2019, at 19:03, Steven D'Aprano <st...@pearwood.info> wrote:
>>> 
>>> On Sun, Jul 21, 2019 at 08:48:49AM +0100, Barry Scott wrote:
>>> 
>>> I took a very quick look at bpo30500 and was struck by the comment 
>>> that the code was working on a URL that had not been validated.
>>> 
>>> Validation of the URL would reject the URL before the parsing happens 
>>> in this case. Was that the case?
>> 
>> Sorry, can you elaborate on that? How do you validate a URL without 
>> attempting to parse it? You're surely not talking about looking it up in 
>> a whitelist are you?
> 
> I was thinking about ensuring that the characters in the URL are from the 
> subset that is allowed. \n is not allowed, for example. Yes, I agree you have 
> to try to parse it.

For a spec that has different sets of restricted characters for different 
parts, that kind of prevalidation doesn’t seem like it would get you very far. 
At least a priori, if there are attacks that involve using illegal characters 
in the netloc or the path or the scheme or whatever, they could just as easily 
be characters that are legal elsewhere in the URL as characters that happen to 
not be legal anywhere.

(If you’re just talking about mitigating one particular attack after it’s been 
discovered, that’s a different story. If checking for \n patches things without 
waiting for the library fix, obviously it’s worth doing.)
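
To make the per-part point concrete, here is a minimal sketch. The character 
sets are deliberately simplified assumptions, not the full RFC 3986 grammar, 
and the names URL_WIDE, HOST_ONLY, globally_valid and host_valid are made up 
for illustration. It shows a URL passing a whole-string character check while 
its netloc still contains a character that is only legal elsewhere:

```python
import string
from urllib.parse import urlsplit

# Assumption: simplified character sets for illustration only.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
# Anything that may appear *somewhere* in a URL.
URL_WIDE = UNRESERVED | set(":/?#[]@!$&'()*+,;=%")
# Hypothetical stricter set for a bare host[:port] with no userinfo.
HOST_ONLY = UNRESERVED | set(":[]")


def globally_valid(url):
    """Whole-string prevalidation: no parsing, one character set."""
    return all(ch in URL_WIDE for ch in url)


def host_valid(url):
    """Per-part check: needs the URL split into components first."""
    return all(ch in HOST_ONLY for ch in urlsplit(url).netloc)


suspicious = "http://good.example@evil.example/"
print(globally_valid(suspicious))  # '@' is legal somewhere in a URL
print(host_valid(suspicious))      # but not in a bare host

# The one-off mitigation case: a newline is legal nowhere,
# so the whole-string check does catch it.
print(globally_valid("http://example.com/\npath"))
```

The whole-string check only catches the newline case. Rejecting the '@' 
requires knowing which part of the URL it sits in, which means parsing first.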
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KTJBOL44F7IAQS6J2P3ANHWPM65PPIFD/
Code of Conduct: http://python.org/psf/codeofconduct/