ID: 48219
Comment by: codeslinger at compsalot dot com
Reported By: carsten_sttgt at gmx dot de
Status: Open
Bug Type: Feature/Change Request
Operating System: *
PHP Version: 5.*, 6CVS (2009-05-09)
New Comment:
This also affects Base64 encoding, which is a very common
method for binary transfers.
I am really surprised to encounter this *bug*.
It means that everything PHP does with regard to saving/moving
uploaded files is wasted effort for such requests. Since the content
transfer encoding is not even accessible, we must instead do our own
parsing of the raw post data. How can that be by design???
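A minimal sketch of what such manual parsing could look like, for illustration only. The helper name decode_multipart_fields() is hypothetical (it is not part of PHP), and the sketch assumes the raw body and the boundary string are already available as strings; obtaining them is outside its scope. It only handles simple form fields and uses the existing PHP functions base64_decode() and quoted_printable_decode().

```php
<?php
// Hypothetical helper, not part of PHP: splits a raw multipart/form-data
// body and decodes each part according to its Content-Transfer-Encoding.
function decode_multipart_fields($rawBody, $boundary)
{
    $fields = array();
    // Parts are separated by "--" followed by the boundary string.
    foreach (explode('--' . $boundary, $rawBody) as $part) {
        $part = ltrim($part, "\r\n");
        // Skip the empty preamble and the closing "--" marker.
        if ($part === '' || strncmp($part, '--', 2) === 0) {
            continue;
        }
        // The part headers end at the first empty line.
        $split = preg_split('/\r?\n\r?\n/', $part, 2);
        if (count($split) < 2) {
            continue;
        }
        list($head, $body) = $split;
        // Drop the single line break that precedes the next boundary.
        $body = preg_replace('/\r?\n\z/', '', $body);
        if (!preg_match('/^Content-Disposition:.*\bname="([^"]*)"/mi', $head, $m)) {
            continue;
        }
        $name = $m[1];
        $encoding = '';
        if (preg_match('/^Content-Transfer-Encoding:\s*(\S+)/mi', $head, $m)) {
            $encoding = strtolower($m[1]);
        }
        // 7bit, 8bit and binary leave the data untouched.
        if ($encoding === 'base64') {
            $body = base64_decode($body);
        } elseif ($encoding === 'quoted-printable') {
            $body = quoted_printable_decode($body);
        }
        $fields[$name] = $body;
    }
    return $fields;
}
```

With the request body from the original report and the boundary "AaB03x", this would return field1 as a 14-byte string in which "=80" has become the single byte 0x80.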
Previous Comments:
------------------------------------------------------------------------
[2009-05-15 00:14:44] carsten_sttgt at gmx dot de
After a quick look at rfc1867.c, I found a lot of:
| #if HAVE_MBSTRING && !defined(COMPILE_DL_MBSTRING)
So I guess correct behavior, according to rfc2616/rfc1867, is only
possible if you have the mbstring extension and it is not built as a
shared extension. (Why does this not work with a shared
extension?)
(I can't test this, because this extension is always shared in my
installations.)
It's like bug #37860:
An HTTP UA sends such a valid POST request and PHP answers
with a status 200. So both browser and script must assume all is OK.
Instead the data is garbled.
In contrast to bug #37860, a status 415 is not defined for this
situation (but maybe it is the best solution for now?).
In the case of bug #37860, the status 415 is defined for such a
situation, but PHP does not return it either :-/ That is also a
problem, because all parties think the POST request is OK.
Regards,
Carsten
------------------------------------------------------------------------
[2009-05-10 19:15:16] carsten_sttgt at gmx dot de
> And this is (I think :) related also to bug #37860
Yes, it's similar. BTW, I think bug #37860 is both a feature request
and a bug.
- Feature: It would be nice if PHP decoded the data
  when the coding is known (see rfc2616-sec3.5/-sec14.1),
  e.g. if the gzip extension is loaded and
  "Content-Encoding: gzip" is set in the request.
- Bug: If PHP can't/won't do this, it should return an HTTP
  status code of 415 (see rfc2616-sec14.11).
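As a rough userland illustration of the "Bug" point above (not something PHP does itself), a script could reject request codings it cannot undo. The function name status_for_content_encoding() is hypothetical, and the sketch assumes the SAPI exposes the request header as $_SERVER['HTTP_CONTENT_ENCODING'] and that gzip is only treated as supported when gzdecode() exists.

```php
<?php
// Hypothetical helper: pick the response status for a request, given its
// Content-Encoding header value and the codings the script can decode.
function status_for_content_encoding($contentEncoding, array $supported)
{
    $coding = strtolower(trim((string) $contentEncoding));
    // No coding (or "identity") means the body was not transformed.
    if ($coding === '' || $coding === 'identity') {
        return 200;
    }
    // Unknown coding: 415 Unsupported Media Type.
    return in_array($coding, $supported, true) ? 200 : 415;
}

// Assumption: the request header is visible as HTTP_CONTENT_ENCODING.
$encoding = isset($_SERVER['HTTP_CONTENT_ENCODING'])
    ? $_SERVER['HTTP_CONTENT_ENCODING']
    : '';
// Assumption: gzip counts as supported only if gzdecode() is available.
$supported = function_exists('gzdecode') ? array('gzip') : array();

if (status_for_content_encoding($encoding, $supported) === 415) {
    header('HTTP/1.1 415 Unsupported Media Type');
    exit;
}
```

This only covers Content-Encoding on the whole request; it does nothing for a Content-Transfer-Encoding inside a multipart part, which the script cannot see.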
> Unfortunately this is a feature request so reclassifying as such.
That's really something I was unsure about. See rfc2616-sec3.7.2. In
general:
| an HTTP user agent SHOULD follow the same or similar behavior
| as a MIME user agent would upon receipt of a multipart type.
| The MIME header fields within each body-part of a multipart
| message-body do not have any significance to HTTP beyond that
| defined by their MIME semantics.
Well, a MIME user agent must decode such data. Because of the "SHOULD"
in this statement, it /can/ be a feature request (but "SHOULD" is more
restrictive than "MAY" / "OPTIONAL").
But in the same section, rfc2616-sec3.7.2:
| Note: The "multipart/form-data" type has been specifically
| defined for carrying form data suitable for processing
| via the POST request method, as described in RFC 1867 [15].
And in rfc1867 (or the newer rfc2388), Content-Transfer-Encoding is
an explicit part of the RFC. So I think HTTP software should know and
handle Content-Transfer-Encoding. Well, Perl's CGI.pm does not do
this either ;-)
BTW:
Unlike Content-Encoding, I can't see the
Content-Transfer-Encoding in the script, so that can really be a
problem. But using a Content-Transfer-Encoding is not usual (or is it
not usual because Perl/PHP can't handle it?).
> btw. Fastest way to get this implemented is by providing a patch. :)
Yeah, if my C were better... ;-)
------------------------------------------------------------------------
[2009-05-10 17:02:30] [email protected]
btw. Fastest way to get this implemented is by providing a patch. :)
------------------------------------------------------------------------
[2009-05-10 17:01:57] [email protected]
Unfortunately this is a feature request so reclassifying as such.
And this is (I think :) related also to bug #37860 and maybe some
others I couldn't find. :)
------------------------------------------------------------------------
[2009-05-10 10:49:28] carsten_sttgt at gmx dot de
Description:
------------
Hello,
In an HTTP POST request with Content-Type "multipart/form-data", each
part can have a Content-Transfer-Encoding, which is defined in RFC2045.
(See also HTML 4.01-sec17.13.4.2)
PHP only works with 7bit, 8bit and binary, because with these values
the data is not transformed.
With base64 or quoted-printable, the data is transformed (encoded),
and PHP should decode it (see also rfc2616-sec3.7.2 / rfc1867-sec3.3).
Just test the example from RFC2388-sec4.5 shown below. This is also a
problem if you upload a file with such a transfer encoding: after
move_uploaded_file(), the content of the file is not what you
expect.
And in a script which receives such data, I can't see (can't know)
whether a Content-Transfer-Encoding was used for something in
$_POST / $_FILES.
It may not be usual, but a client can use such a
Content-Transfer-Encoding at any time in a POST request.
Regards,
Carsten
Reproduce code:
---------------
Create a simple "test.php" in your DocumentRoot:
==================
<?php
var_dump($_POST);
?>
==================
Telnet to localhost:80 and send this request:
======================================================
POST http://localhost/test.php HTTP/1.0
Content-Length: 181
Content-Type: multipart/form-data; boundary=AaB03x

--AaB03x
Content-Disposition: form-data; name="field1"
Content-Type: text/plain;charset=windows-1250
Content-Transfer-Encoding: quoted-printable

Joe owes =80100.
--AaB03x--
======================================================
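If typing the request into telnet is inconvenient, the same request can be assembled in PHP. This is only a sketch: the function names build_test_body(), build_test_request() and send_test_request() are hypothetical, and unlike the telnet version it sends a path plus a Host header instead of an absolute URL. Line endings are CRLF, and the separating empty lines after the request headers and after the part headers are required; with them the body is exactly 181 bytes long.

```php
<?php
// Builds the multipart body from the report, using CRLF line endings.
function build_test_body($boundary)
{
    return '--' . $boundary . "\r\n"
        . "Content-Disposition: form-data; name=\"field1\"\r\n"
        . "Content-Type: text/plain;charset=windows-1250\r\n"
        . "Content-Transfer-Encoding: quoted-printable\r\n"
        . "\r\n"
        . "Joe owes =80100.\r\n"
        . '--' . $boundary . "--\r\n";
}

// Builds the complete request; Content-Length is computed from the body.
function build_test_request($host, $path, $boundary)
{
    $body = build_test_body($boundary);
    return 'POST ' . $path . " HTTP/1.0\r\n"
        . 'Host: ' . $host . "\r\n"
        . 'Content-Length: ' . strlen($body) . "\r\n"
        . 'Content-Type: multipart/form-data; boundary=' . $boundary . "\r\n"
        . "\r\n"
        . $body;
}

// Sends the request and returns the raw response, or false on failure.
// Assumption: a web server serving test.php listens on $host:$port.
function send_test_request($host, $port, $path, $boundary)
{
    $socket = fsockopen($host, $port, $errno, $errstr, 5);
    if ($socket === false) {
        return false;
    }
    fwrite($socket, build_test_request($host, $path, $boundary));
    $response = stream_get_contents($socket);
    fclose($socket);
    return $response;
}
```

For example, send_test_request('localhost', 80, '/test.php', 'AaB03x') would send the request from this report.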
Expected result:
----------------
array(1) {
["field1"]=>
string(14) "Joe owes €100."
}
Actual result:
--------------
array(1) {
["field1"]=>
string(16) "Joe owes =80100."
}
------------------------------------------------------------------------