On Tue, Jul 29, 2008 at 12:39 PM, Manlio Perillo <[EMAIL PROTECTED]> wrote:
> Bill Janssen wrote:
>> Actually, it's defined for all fields, isn't it?  From RFC 2388:
>>
>> ``As with all multipart MIME types, each part has an optional
>> "Content-Type", which defaults to text/plain.''
>>
>> So the type is "text/plain" unless it says something else.  And,
>> according to RFC 2046, the default charset for "text/plain" is
>> "US-ASCII".
>
> Ok with theory.
> But in practice:
>
> <form action="" method="post" accept-charset="utf-8"
>       enctype="multipart/form-data">
> [...]
>
> In theory I should assume ascii encoded data for the body field; and since
> this data can not be decoded, I should assume it as byte string.
>
> However the body field is encoded in utf-8, and if I add an hidden _charset_
> field, FF and IE add this field in the response, with the charset used in
> the encoding.
From what I've seen, most user agents fail to send a Content-Type
for the individual parts, much less a charset parameter.  Many will
also ignore the accept-charset attribute of the <form> element.

However, most browsers will send the text fields of a POST request
in the same character set in which the page containing the <form>
element was sent to the browser in the first place.  So if you output
HTML pages in UTF-8, the text portions of POST messages will come
back in UTF-8.  It's not following any standard, but it's the way
things seem to work.

I would think it most useful if the decoding framework strictly
followed the RFC and assumed "text/plain; charset=US-ASCII", but also
allowed the caller some means of indicating a different default.
Obviously, if a user agent does provide a complete Content-Type, it
should be used.

-- 
Deron Meranda
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
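
To make the policy described above concrete, here is a minimal sketch in
Python 3.  It is only an illustration under stated assumptions: the names
decode_field and part_charset are hypothetical and do not belong to any
existing framework or to WSGI.  The only library call relied on is the
standard library's email.message.Message.get_content_charset().  The order
of preference is: the charset declared in the part's own Content-Type, then
a default supplied by the caller, then the RFC default of US-ASCII.  Data
that cannot be decoded is handed back as a byte string.

```python
from email.message import Message

# Default charset for text/plain according to RFC 2046.
RFC_DEFAULT_CHARSET = "us-ascii"


def part_charset(content_type, default=RFC_DEFAULT_CHARSET):
    """Return the charset declared in a part's Content-Type header value.

    Falls back to `default` when the header is absent or carries no
    charset parameter.  (Hypothetical helper, for illustration only.)
    """
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lower-cased charset parameter,
    # or None when the parameter is missing.
    return msg.get_content_charset() or default


def decode_field(raw, content_type=None, default=RFC_DEFAULT_CHARSET):
    """Decode one form field body (bytes) to text.

    The raw bytes are returned unchanged when they cannot be decoded
    with the chosen charset, or when that charset name is unknown.
    """
    charset = part_charset(content_type, default)
    try:
        return raw.decode(charset)
    except (UnicodeDecodeError, LookupError):
        return raw


if __name__ == "__main__":
    body = "caff\u00e8".encode("utf-8")
    # Strict RFC behaviour: the bytes are not ASCII, so they stay bytes.
    print(repr(decode_field(body)))
    # The application knows it served the page in UTF-8.
    print(repr(decode_field(body, default="utf-8")))
    # A complete Content-Type from the user agent takes precedence.
    print(repr(decode_field(body, "text/plain; charset=UTF-8")))
```

An application that serves its pages in UTF-8 would pass default="utf-8";
a strict caller would leave the default alone and deal with the byte
strings itself.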