[issue23434] support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage

2020-01-08 Thread And Clover


And Clover  added the comment:

> Are you saying there is no (http) RFC compliant way to fix this

Sadly, yes.

And though RFCs aren't always a fair representation of real-world use, RFC 7578 
is informative as well as normative: at present nothing produces "filename*=" 
in multipart/form-data.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23434] support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage

2020-01-07 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner




[issue23434] support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage

2020-01-07 Thread R. David Murray


R. David Murray  added the comment:

Are you saying there is no (http) RFC compliant way to fix this, or no way to 
fix it with the email library parsers?  If the latter, the library is pretty 
flexible and for internal stdlib use it would probably be permissible to 
directly call methods in the internal parsing module, if those would be useful.

I haven't re-read the issue to reload my brain, so this question may be off 
point (except for the first clause of the question).




[issue23434] support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage

2020-01-02 Thread And Clover


And Clover  added the comment:

HTTP generally isn't an RFC 822-family standard. Its headers look a lot like 
RFC 822 headers, but they have their own defined syntax that differs in 
niggling little details. Using mail parsing code for HTTP isn't usually the 
right thing.

HTTP has always used its own syntax definitions for the headers on the main 
request/response entities, but it has traditionally partially deferred to RFC 
822-family specs for the definitions of structured entity bodies. This is moot, 
however, as the reality of what browsers support has rarely coincided with 
those specs.

Nowadays HTML5.2 explicitly defers to RFC 7578 for definition of 
multipart/form-data headers. (This RFC is a replacement for the vague and 
broken RFC 2388.) As is to be expected for an HTML5-related spec, RFC 7578 
shrugs and documents existing browser behaviour [section 4.2]:

- some browsers do UTF-8
- some browsers do data mangling (IE's %-encoding sadness)
- some browsers might do something else

but it explicitly rules out the solution proposed here:

"The encoding method described in [RFC5987], which would add a 'filename*' 
parameter to the Content-Disposition header field, MUST NOT be used."

The introductions of both RFC 5987 and RFC 6266 explicitly exclude 
multipart/form-data headers from their remit.
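As a rough illustration of the browser behaviours listed above, here is a minimal sketch of how a server might decode the raw bytes of a plain "filename" parameter. The function name and the Latin-1 fallback are assumptions made for this example; neither comes from RFC 7578 nor from the cgi module.

```python
def decode_form_filename(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode the raw bytes of a multipart/form-data filename parameter.

    Hypothetical helper: tries UTF-8 first (the "some browsers do UTF-8"
    case) and falls back to another charset if the bytes are not valid
    UTF-8. The fallback charset is an assumption, not a standard.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


# A filename sent as raw UTF-8 bytes.
print(decode_form_filename("naïve.txt".encode("utf-8")))
# The same filename sent as Latin-1 bytes, which are not valid UTF-8.
print(decode_form_filename(b"na\xefve.txt"))
```

This does nothing about the %-encoding mangling case, which cannot be told apart from a filename that really contains a percent sign.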

So in summary:

- we shouldn't do anything
- the situation with submitted filenames will continue to be broken for 
everyone indefinitely

--
nosy: +aclover




[issue23434] support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage

2018-03-22 Thread R. David Murray

R. David Murray  added the comment:

I haven't read the HTTP RFCs, but my understanding is that they follow the MIME 
standards, and the email library already has code to do proper parsing and 
decoding of encoded filenames in Content-Disposition headers.  It should be 
possible to call that code for this use case (the http libraries already depend 
on the email libraries, although I'm not sure if cgi itself does currently).  
There may be additional considerations involved in fully supporting the HTTP 
RFCs, but to determine that, someone will need to read both and understand them, 
which is not a small undertaking :)

In the meantime, I'm pretty sure that using the existing mime header parsing 
code in the email library (see email.headerregistry) will provide better 
parsing than the only-handles-simple-cases heuristic in your PR.  Granted, I 
don't think you have to deal with multi-part headers in http, but I vaguely 
remember that there are other subtleties not handled by a simple split on '.
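
A minimal sketch of the email-library approach, assuming the header value is already available as a str. It uses the legacy email.message.Message API rather than email.headerregistry, because its get_filename() method is a compact way to show the RFC 2231 decoding. The helper name is hypothetical, and this only demonstrates that the email package can decode such a parameter, not that cgi.FieldStorage should do so.

```python
from email.message import Message


def filename_from_content_disposition(header_value: str):
    """Return the decoded filename parameter, or None if there is none.

    Hypothetical helper: wraps the header in a throwaway Message so that
    the email package handles quoting and RFC 2231 "filename*" decoding.
    """
    msg = Message()
    msg["Content-Disposition"] = header_value
    return msg.get_filename()


# Plain quoted filename.
print(filename_from_content_disposition(
    'form-data; name="upload"; filename="report.txt"'))
# RFC 2231 / RFC 5987 style encoded filename.
print(filename_from_content_disposition(
    "form-data; name=\"upload\"; filename*=UTF-8''na%C3%AFve.txt"))
```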




[issue23434] support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage

2018-03-22 Thread Paweł

Paweł  added the comment:

I didn't find this issue and created a duplicate:
https://bugs.python.org/issue33027

I've added similar/updated changes
https://github.com/python/cpython/pull/6027

@r.david.murray wouldn't it be wise to do one step at a time rather than 
implementing full support for RFC6266? Please tell me exactly what your 
expectations are so I can fix the patch if it needs to be fixed.

This is also related to RFC5987
https://tools.ietf.org/html/rfc5987
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition
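
For reference, a minimal sketch of what an RFC 5987 encoded parameter looks like when generated, for example for an HTTP response header. The function name is hypothetical, and percent-encoding everything outside the unreserved set is a conservative assumption (RFC 5987 allows a few more characters to stay literal).

```python
from urllib.parse import quote


def rfc5987_filename_param(filename: str) -> str:
    """Build a filename*= parameter with a UTF-8 percent-encoded value.

    Hypothetical helper: safe="" makes quote() percent-encode every
    character except letters, digits and "_.-~".
    """
    return "filename*=UTF-8''" + quote(filename, safe="", encoding="utf-8")


print("Content-Disposition: attachment; "
      + rfc5987_filename_param("naïve file.txt"))
```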

--
components: +Unicode
nosy: +ezio.melotti, pawciobiel, vstinner
pull_requests: +5937
stage:  -> patch review
title: RFC6266 support (Content-Disposition for HTTP) -> support encoded 
filename in Content-Disposition for HTTP in cgi.FieldStorage
versions: +Python 2.7, Python 3.4, Python 3.5, Python 3.7, Python 3.8
