#18004: Django should not use `force_unicode(..., errors='replace')` when
parsing POST data.
-------------------------------------+-------------------------------------
Reporter: mrmachine                  | Owner: aaugustin
Type: Bug                            | Status: assigned
Component: HTTP handling             | Version: master
Severity: Normal                     | Resolution:
Keywords: post data unicode utf8     | Triage Stage: Unreviewed
  encode decode transaction aborted  |
Has patch: 1                         | Needs documentation: 0
Needs tests: 0                       | Patch needs improvement: 0
Easy pickings: 0                     | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by aaugustin):
* stage: Design decision needed => Unreviewed
Comment:
Yes, I have strong objections to your proposal: I'm not going to add a
workaround for a problem that we haven't identified yet.
All of the above is vague, and we still don't know how to trigger this
error.
----
I did the research, and RFC 1867 says that non-ASCII file names should be
approximated or encoded:
> The client application should make best
> effort to supply the file name; if the file name of the client's
> operating system is not in US-ASCII, the file name might be
> approximated or encoded using the method of RFC 1522.
This is repeated in section 5.11 - Non-ASCII field names:
> Note that mime headers are generally required to consist only of 7-
> bit data in the US-ASCII character set. Hence field names should be
> encoded according to the prescriptions of RFC 1522 if they contain
> characters outside of that set. In HTML 2.0, the default character
> set is ISO-8859-1, but non-ASCII characters in field names should be
> encoded.
RFC 1522 describes the MIME encoded-word syntax, and this encoding
explicitly includes the charset.
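
To illustrate how an encoded word carries its own charset, here is a minimal
sketch using Python's standard `email.header.decode_header`. It is not what
Django's multipart parser does; the helper name `decode_encoded_word` and the
sample file names are made up for this example.

```python
from email.header import decode_header


def decode_encoded_word(value):
    """Decode a header value that may contain RFC 1522 encoded words.

    Hypothetical helper for illustration only. An encoded word looks like
    =?charset?encoding?encoded-text?= so the charset travels with the data.
    """
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            # Unencoded fragments come back with charset None; assume ASCII.
            parts.append(data.decode(charset or "ascii"))
        else:
            # A value with no encoded words is returned unchanged as str.
            parts.append(data)
    return "".join(parts)


# The same file name sent in two different charsets decodes identically,
# because each encoded word names the charset it was written in.
print(decode_encoded_word("=?utf-8?q?caf=C3=A9.txt?="))
print(decode_encoded_word("=?iso-8859-1?q?caf=E9.txt?="))
```

A client that follows the RFC this way never leaves the server guessing which
charset a non-ASCII file name is in.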
----
Note that you're the only person to have ever hit this problem; for all I
know this could be a bug in your code. The only way to be sure is to log a
request, and figure out why Django can't parse it.
Here's what I would suggest: in the problematic view, catch the
`DatabaseError`, and when it occurs, dump `request.body` to a file in
binary mode. Once we have this file, we can figure out why Django ends up
with invalid UTF-8 data.
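
The suggestion above could be sketched as a small decorator. This is only an
outline under stated assumptions: the name `dump_body_on`, the file prefix
and the dump directory are made up, and the exception class is passed in as a
parameter (in a Django project it would be `django.db.DatabaseError`) so that
the snippet does not depend on Django being importable.

```python
import functools
import os
import tempfile


def dump_body_on(exc_type, directory):
    """Wrap a view; on exc_type, write the raw request body to a file.

    Hypothetical helper for illustration only.
    """
    def decorator(view):
        @functools.wraps(view)
        def wrapper(request, *args, **kwargs):
            # Read the raw bytes before the view runs. Assumption: once the
            # view has parsed a multipart POST, the raw body may no longer
            # be readable from the request.
            body = request.body
            try:
                return view(request, *args, **kwargs)
            except exc_type:
                fd, path = tempfile.mkstemp(
                    prefix="request-body-", suffix=".bin", dir=directory
                )
                # Binary mode: the bytes must be stored exactly as received.
                with os.fdopen(fd, "wb") as fh:
                    fh.write(body)
                raise  # keep the original error visible
        return wrapper
    return decorator
```

Applied as `@dump_body_on(DatabaseError, "/some/writable/dir")` on the
problematic view, each failing request leaves one `.bin` file that can be
attached to the ticket. The directory shown is a placeholder.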
--
Ticket URL: <https://code.djangoproject.com/ticket/18004#comment:14>