#29323: HTTPRequest QueryDict wrongly decodes binary encoded values
---------------------------------+--------------------------------------
Reporter: Thomas Riccardi | Owner: nobody
Type: Bug | Status: closed
Component: HTTP handling | Version: 1.11
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------
Comment (by Thomas Riccardi):
Replying to [comment:8 Tim Graham]:
> Could you cite your source?
> [https://www.w3.org/TR/2003/REC-xforms-20031014/slice11.html#serialize-urlencode I found]:
>
> [`application/x-www-form-urlencoded`] is not suitable for the
> persistence of binary content. Therefore, it is recommended that forms
> capable of containing binary content use another serialization method.
I believe this is a spec related to XHTML: Firefox [deprecated XForms in
Firefox 19](https://developer.mozilla.org/en-US/docs/Archive/Web/XForms),
and XForms 2 is targeted at XML languages (see the abstract:
https://www.w3.org/TR/xforms20/ ).
Anyway, it is just a recommendation: the format technically allows
serializing binary: at worst each byte is percent-encoded.
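
To illustrate the point about percent-encoding, here is a minimal sketch
using only the Python standard library (not Django). It assumes nothing
beyond `urllib.parse`; the `payload` value is made up for the example.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

# Every possible byte value, including ones that are not valid UTF-8.
payload = bytes(range(256))

# With safe="" only unreserved characters stay literal; every other
# byte becomes a %XX escape.
encoded = quote_from_bytes(payload, safe="")

# Decoding the escapes straight to bytes recovers the original payload.
decoded = unquote_to_bytes(encoded)
assert decoded == payload
```

The round trip is lossless as long as the receiver decodes to bytes and
does not try to interpret the result as text.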
The WHATWG URL spec explains the [application/x-www-form-urlencoded byte
serializer](https://url.spec.whatwg.org/#concept-urlencoded-byte-serializer):
it takes a byte sequence as input and returns a string. A previous step
documents how to generate the byte sequence via encoding:
https://url.spec.whatwg.org/#concept-urlencoded-serializer.
So it may be a stretch to serialize raw binary rather than a string that
we first encode, but the wire format would be the same. Only the user API
would change: bytes instead of string.
> [https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/POST and]:
>
> application/x-www-form-urlencoded: the values are encoded in key-value
> tuples separated by '&', with a '=' between the key and the value.
> Non-alphanumeric characters are percent encoded: this is the reason why
> this type is not suitable to use with binary data (use
> multipart/form-data instead)
Either "not suitable" means it won't work, which is wrong, or it means
it's not recommended but would still work.
> Maybe a custom middleware could allow your use case.
>
> If you have a fix to offer, we could evaluate it.
I'm OK to write a fix (I initially started to write a simple fix and unit
tests, but I encountered more issues, as previously explained), but I
would require some guidance. The open questions I have are:
- `QueryDict` output type: it should return `bytes` instead of `str` in
this case. How do we expose that? Is it OK to have variable types?
- How to ask for bytes? Do we cheat with the `encoding` parameter? I'm not
sure who controls it: the request sender or the developer?
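
To make the two questions concrete, here is a rough sketch with the
standard library only. It is not Django's `QueryDict` code and not a
proposed patch; `parse_qsl_bytes` is a hypothetical helper written for
this example, and the `blob` field is made up.

```python
from urllib.parse import parse_qsl, unquote_to_bytes


def parse_qsl_bytes(query):
    """Hypothetical helper: return (name, value) pairs as raw bytes."""
    pairs = []
    for field in query.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        # In form encoding "+" stands for a space; then undo the %XX escapes
        # without decoding the resulting bytes as text.
        pairs.append((
            unquote_to_bytes(name.replace("+", " ")),
            unquote_to_bytes(value.replace("+", " ")),
        ))
    return pairs


query = "blob=%FF%00%80"

# Text decoding: bytes that are invalid UTF-8 are replaced by U+FFFD,
# so the original bytes cannot be recovered from the result.
as_text = parse_qsl(query, encoding="utf-8", errors="replace")
assert "\ufffd" in as_text[0][1]

# Bytes decoding: the value survives unchanged, but the caller now gets
# bytes where it would otherwise get str.
as_bytes = parse_qsl_bytes(query)
assert as_bytes == [(b"blob", b"\xff\x00\x80")]
```

The second call shows the API question: the same field comes back as
`bytes` or `str` depending on how the caller asked for it.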
Thanks,
Thomas
--
Ticket URL: <https://code.djangoproject.com/ticket/29323#comment:9>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.