[issue15564] cgi.FieldStorage should not call read_multi on files

Glenn Linderman Sat, 11 Aug 2012 13:34:11 -0700

Glenn Linderman added the comment:

I didn't call the current behaviour of browsers in assigning MIME types 
automatically based on file extension a bug; I would consider it more of a 
missing capability, an oversight due to the rareness of attempts to upload 
MHTML files. This is similar to the situation of email clients automatically 
choosing the Content-Disposition for attachments (which is just a 
recommendation) about whether to suggest they be displayed inline, or provided 
as attachments to be saved. Most automatically select a Content-Disposition 
based on their own capability to deal with an attachment of a particular MIME 
type, rather than the (unknown) capability of the email client of the ultimate 
recipient. I think in both cases, the default behavior works well enough, for 
a large enough subset of cases, that there has been little demand for more 
functionality, even though one can contrive reasonable-sounding cases for that 
functionality.


As a point of discussion, my perception is that MHTML files have two uses: to 
email an image of a web page (something typically done implicitly by bundled 
email/web-browser client software, and not generally explicit in the creation 
of a standalone MHTML file), and to archive a web page for local reference. 
Neither of these uses involves uploading MHTML files to web sites, although saving 
a web page, and then attempting to email it to a friend as an attachment via a 
web mail client might encounter the same difficulty you are having.

Another use I have heard discussed (but I've forgotten where, so have no 
references), is as a source for custom browsers to prepackage responses for 
particular web forms.  In that case, I think it would be the custom browser's 
responsibility to supply the MHTML file content as a response to the form 
request, rather than to supply it as an uploaded file, expecting the server to 
dissect it... 

I think it is obvious that my personal, first reaction is that the parsing 
problem should be fixed... if the MIME type states it is multipart, it should 
be dissected into its parts... and if that is not the desired behavior, then the 
MIME type should be different.  Email standards, the source of MIME type 
specifications, certainly use and support nested multipart dissection, although 
various email software performs it in various manners and to various levels. 
Naturally, if the content syntax of the multipart file is incorrect, it should 
produce an exception, the same as if the multipart content a (buggy) browser 
produced from an HTML form were syntactically incorrect.
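
As a rough illustration of what nested multipart dissection looks like, here 
is a minimal sketch using the standard library's email package rather than 
cgi.FieldStorage. The sample message and the helper name leaf_types are 
invented for this example and do not come from the issue.

```python
from email import message_from_bytes

# Invented sample: a multipart/mixed body whose second part is itself
# multipart/related (the MIME type typically assigned to .mht files).
RAW = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="outer"\r\n'
    b"\r\n"
    b"--outer\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"note\r\n"
    b"--outer\r\n"
    b'Content-Type: multipart/related; boundary="inner"\r\n'
    b"\r\n"
    b"--inner\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<p>page</p>\r\n"
    b"--inner\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"iVBORw0KGgo=\r\n"
    b"--inner--\r\n"
    b"--outer--\r\n"
)


def leaf_types(raw):
    """Return the content types of all non-container parts, depth first."""
    msg = message_from_bytes(raw)
    # walk() visits the message and every nested part; skip the containers.
    return [p.get_content_type() for p in msg.walk() if not p.is_multipart()]


print(leaf_types(RAW))
```

Note that the email parser, with its default policy, records syntax problems 
as defects on the message instead of raising, so a toolkit that wants an 
exception for malformed content would have to check for them itself.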

Given that browsers offer no way to specify the MIME type of an upload (this 
is .mht, but treat it as application/octet-stream rather than 
multipart/related), it does seem that web server toolkits such as 
cgi.FieldStorage might want to offer an option or hook that lets an application 
disable the otherwise automatic parsing of multipart/* files.
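
One possible shape for such a hook, purely as a sketch: the function name 
should_dissect and the flag dissect_file_uploads are hypothetical, and nothing 
like them exists in cgi.FieldStorage. The idea is to dissect a multipart/* 
part by default only when it is not a file upload, that is, when its 
Content-Disposition carries no filename.

```python
from email.message import Message


def should_dissect(part_headers, dissect_file_uploads=False):
    """Hypothetical policy hook: decide whether a form part gets dissected.

    part_headers maps header names to values for one part of a
    multipart/form-data request body.
    """
    msg = Message()
    for name, value in part_headers.items():
        msg[name] = value
    is_multipart = msg.get_content_maintype() == "multipart"
    # get_filename() reads the filename parameter of Content-Disposition.
    is_file = msg.get_filename() is not None
    return is_multipart and (dissect_file_uploads or not is_file)


upload = {
    "Content-Disposition": 'form-data; name="f"; filename="page.mht"',
    "Content-Type": 'multipart/related; boundary="x"',
}
print(should_dissect(upload))                             # keep as raw bytes
print(should_dissect(upload, dissect_file_uploads=True))  # application opts in
```

With a default like this, an uploaded .mht file would be kept as opaque 
content unless the application explicitly asks for it to be dissected.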

This is a rather murky area, indeed. Research into whether and how other web 
toolkits handle such a situation would be interesting in deciding how to 
proceed. While there is no need for Python to slavishly follow the lead of any 
other particular web toolkit, it would be interesting to know whether any of 
them successfully parse such files, whether any ignore the MIME type for 
uploaded files, and whether any support options for handling uploaded files 
with multipart/* MIME types.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15564>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
