#29323: HTTPRequest QueryDict wrongly decodes binary encoded values
---------------------------------+--------------------------------------
     Reporter:  Thomas Riccardi  |                    Owner:  nobody
         Type:  Bug              |                   Status:  closed
    Component:  HTTP handling    |                  Version:  1.11
     Severity:  Normal           |               Resolution:  wontfix
     Keywords:                   |             Triage Stage:  Unreviewed
    Has patch:  0                |      Needs documentation:  0
  Needs tests:  0                |  Patch needs improvement:  0
Easy pickings:  0                |                    UI/UX:  0
---------------------------------+--------------------------------------

Comment (by Thomas Riccardi):

 Replying to [comment:8 Tim Graham]:
 > Could you cite your source?
 > [https://www.w3.org/TR/2003/REC-xforms-20031014/slice11.html#serialize-urlencode I found]:
 >
 >  [`application/x-www-form-urlencoded`] is not suitable for the
 >  persistence of binary content. Therefore, it is recommended that forms
 >  capable of containing binary content use another serialization method.

 I believe this spec relates to XHTML. Firefox [deprecated it in
 Firefox 19](https://developer.mozilla.org/en-US/docs/Archive/Web/XForms),
 and XForms 2 targets XML languages (see the abstract:
 https://www.w3.org/TR/xforms20/ ).

 Anyway, it is only a recommendation. The format technically allows
 serializing binary data: at worst, every byte is percent-encoded.
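
As a rough illustration of that point (a sketch using only Python's standard library `urllib.parse`, not Django code), every possible byte value survives a percent-encoding round trip as long as the result is decoded back to bytes rather than to text:

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

# Every possible byte value, including sequences that are invalid UTF-8.
raw = bytes(range(256))

# safe="" percent-encodes everything except unreserved characters
# (letters, digits and "_.-~"), so the result is plain ASCII.
encoded = quote_from_bytes(raw, safe="")

# Decoding to bytes (not str) recovers the original payload exactly.
decoded = unquote_to_bytes(encoded)
assert decoded == raw
```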

 The WHATWG URL spec explains the [application/x-www-form-urlencoded byte
 serializer](https://url.spec.whatwg.org/#concept-urlencoded-byte-serializer):
 it takes a byte array as input and returns a string. A previous step
 documents how to generate the byte array via encoding:
 https://url.spec.whatwg.org/#concept-urlencoded-serializer.

 So it may be a stretch to serialize raw binary rather than a string that
 we first encode, but the wire format would be the same. Only the user API
 would change: bytes instead of a string.
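
To illustrate the "same format, different input type" point, here is a small sketch with Python's standard library `urlencode`, which accepts both `str` and `bytes` values. The field name `v` is just a placeholder for the example:

```python
from urllib.parse import urlencode

# A str value is first encoded to bytes (UTF-8 by default), then serialized.
text_form = urlencode({"v": "é"})

# Passing the same two bytes directly yields the identical serialization.
bytes_form = urlencode({"v": b"\xc3\xa9"})
assert text_form == bytes_form == "v=%C3%A9"

# Bytes that are not valid UTF-8 serialize the same way, byte by byte.
binary_form = urlencode({"v": b"\xff\x00\x80"})
assert binary_form == "v=%FF%00%80"
```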

 > [https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/POST and]:
 >
 >  application/x-www-form-urlencoded: the values are encoded in key-value
 >  tuples separated by '&', with a '=' between the key and the value.
 >  Non-alphanumeric characters are percent encoded: this is the reason why
 >  this type is not suitable to use with binary data (use
 >  multipart/form-data instead)

 Either "not suitable" means it won't work, which is wrong, or it means
 it's not recommended, in which case it would still work.

 > Maybe a custom middleware could allow your use case.
 >
 > If you have a fix to offer, we could evaluate it.

 I'm happy to write a fix (I started on a simple fix with unit tests, but
 ran into further issues, as previously explained), but I would need some
 guidance. The open questions I have are:
 - `QueryDict` output type: it should return `bytes` instead of `str` in
 this case. How do we expose that? Is it OK to have variable return types?
 - How to ask for bytes? Do we cheat with the `encoding` parameter? I'm not
 sure who controls it: the request sender or the developer?
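
To make the first question more concrete, here is a rough sketch of what a bytes-returning parse could look like next to the usual text-returning one. It uses only Python's standard library. `parse_qsl_bytes` is a hypothetical helper name, not an existing Django or stdlib API, and decoding field names as UTF-8 while leaving values as bytes is an assumption made for the example:

```python
from urllib.parse import parse_qsl, unquote_to_bytes


def parse_qsl_bytes(query):
    """Hypothetical helper: parse a urlencoded string, keeping values as bytes."""
    pairs = []
    for field in query.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        # In this format "+" stands for a space; handle it before unquoting.
        name_bytes = unquote_to_bytes(name.replace("+", " "))
        value_bytes = unquote_to_bytes(value.replace("+", " "))
        # Assumption: field names are UTF-8 text; only values stay binary.
        pairs.append((name_bytes.decode("utf-8"), value_bytes))
    return pairs


query = "blob=%FF%00%80&name=a+b"

# Decoding to str is lossy here: bytes that are invalid UTF-8 are replaced
# with U+FFFD, because parse_qsl defaults to errors="replace".
lossy = dict(parse_qsl(query))

# Keeping the values as bytes preserves the payload.
exact = dict(parse_qsl_bytes(query))
assert exact["blob"] == b"\xff\x00\x80"
```

With a shape like this the caller always gets `bytes` for values and decides whether to decode them, which avoids variable return types at the cost of a separate entry point.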


 Thanks,
 Thomas

-- 
Ticket URL: <https://code.djangoproject.com/ticket/29323#comment:9>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
