ID: 48219
User updated by: carsten_sttgt at gmx dot de
Reported By: carsten_sttgt at gmx dot de
Status: Open
Bug Type: Feature/Change Request
Operating System: *
PHP Version: 5.*, 6CVS (2009-05-09)
New Comment:
After a quick look at rfc1867.c, I found a lot of:
| #if HAVE_MBSTRING && !defined(COMPILE_DL_MBSTRING)
So I guess correct behavior, according to rfc2616/rfc1867, is only
possible if you have the mbstring extension and it is not built as a
shared extension. (Why does this not work with a shared extension?)
(I can't test this, because this extension is always shared in my
installations.)
It's like bug #37860:
An HTTP UA sends such a valid POST request and PHP answers with
status 200. So both the browser and the script must assume all is OK.
Instead, the data is garbled.
In contrast to bug #37860, returning status 415 is not defined for this
case (but maybe it is the best solution for now?).
In the case of bug #37860, status 415 is defined for such a
situation, but PHP does not return it either :-/ That is also a problem
if all parties think the POST request is OK.
Regards,
Carsten
Previous Comments:
------------------------------------------------------------------------
[2009-05-10 19:15:16] carsten_sttgt at gmx dot de
> And this is (I think :) related also to bug #37860
Yes, it's similar. BTW, I think bug #37860 is both a feature request
and a bug.
- Feature: It would be nice if PHP decoded the data
if the coding is known (see rfc2616-sec3.5/-sec14.1),
e.g. if the gzip extension is loaded and
"Content-Encoding: gzip" is set in the request.
- Bug: If PHP can't/won't do this, it should return an HTTP
status code of 415 (see rfc2616-sec14.11).
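The two options above could look roughly like this in userland. This is only a sketch under assumptions: `handle_content_encoding()` is a hypothetical helper (not something PHP provides), it only knows gzip, and it relies on the zlib extension for `gzdecode()`.

```php
<?php
// Hypothetical helper (not a PHP built-in): decide what to do with a
// request body based on its Content-Encoding header.
// Returns array(HTTP status code, decoded body or null).
function handle_content_encoding($rawBody, $contentEncoding)
{
    $encoding = strtolower(trim((string) $contentEncoding));

    // No content coding: pass the body through unchanged.
    if ($encoding === '' || $encoding === 'identity') {
        return array(200, $rawBody);
    }

    // Known coding and the decoder is available (assumes ext/zlib).
    if (($encoding === 'gzip' || $encoding === 'x-gzip')
        && function_exists('gzdecode')) {
        $decoded = @gzdecode($rawBody);
        // Corrupt gzip data is treated as a bad request in this sketch.
        return ($decoded === false) ? array(400, null) : array(200, $decoded);
    }

    // Unknown coding, or no decoder: 415 Unsupported Media Type.
    return array(415, null);
}

// Demonstration with an in-memory body instead of a real request.
if (function_exists('gzencode')) {
    list($status, $body) = handle_content_encoding(gzencode('a=1&b=2'), 'gzip');
    var_dump($status, $body);
}
list($status) = handle_content_encoding('whatever', 'compress');
var_dump($status); // int(415)
```

In a real script the inputs would presumably come from `$_SERVER['HTTP_CONTENT_ENCODING']` and `php://input`, and a non-200 status could be sent with `http_response_code()`. Note that `php://input` is not available for multipart/form-data requests, so this only covers other body types.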
> Unfortunately this is a feature request so reclassifying as such.
That's really something I was unsure about. See rfc2616-sec3.7.2. In
general:
| an HTTP user agent SHOULD follow the same or similar behavior
| as a MIME user agent would upon receipt of a multipart type.
| The MIME header fields within each body-part of a multipart
| message-body do not have any significance to HTTP beyond that
| defined by their MIME semantics.
Well, a MIME user agent must decode such data. Because of the "SHOULD"
in this statement, it /can/ be a feature request (but "SHOULD" is more
restrictive than "MAY" / "OPTIONAL").
But the same section (rfc2616-sec3.7.2) also says:
| Note: The "multipart/form-data" type has been specifically
| defined for carrying form data suitable for processing
| via the POST request method, as described in RFC 1867 [15].
And in rfc1867 (or the newer rfc2388), Content-Transfer-Encoding is
an explicit part of the RFC. So I think HTTP software should know and
handle Content-Transfer-Encoding. Well, Perl's CGI.pm doesn't do this
either ;-)
BTW:
Unlike the Content-Encoding, I can't see the
Content-Transfer-Encoding in the script. So that can really be a
problem. But using a Content-Transfer-Encoding is not usual (or is it
not usual because Perl/PHP can't handle it?).
> btw. Fastest way to get this implemented is by providing a patch. :)
Yeah, if my C were better... ;-)
------------------------------------------------------------------------
[2009-05-10 17:02:30] [email protected]
btw. Fastest way to get this implemented is by providing a patch. :)
------------------------------------------------------------------------
[2009-05-10 17:01:57] [email protected]
Unfortunately this is a feature request so reclassifying as such.
And this is (I think :) related also to bug #37860 and maybe some
others I couldn't find. :)
------------------------------------------------------------------------
[2009-05-10 10:49:28] carsten_sttgt at gmx dot de
Description:
------------
Hello,
In an HTTP POST request with Content-Type "multipart/form-data", each
part can have a Content-Transfer-Encoding, which is defined in RFC2045.
(See also HTML 4.01-sec17.13.4.2)
PHP only works with 7bit, 8bit and binary, because with these values
the data is not transformed.
With base64 or quoted-printable, the data is transformed (encoded),
and PHP should decode it (see also rfc2616-sec3.7.2 / rfc1867-sec3.3).
Just test the example from RFC2388-sec4.5 (see the reproduce code). This
is also a problem if you upload a file with such a transfer encoding:
after move_uploaded_file(), the content of the file is not what you
expect.
And in a script which receives such data, I can't see (can't know)
whether a Content-Transfer-Encoding was used for something in
$_POST / $_FILES. It may not be usual, but a client can use such a
Content-Transfer-Encoding at any time in a POST request.
Regards,
Carsten
Reproduce code:
---------------
Create a simple "test.php" in your DocumentRoot:
==================
<?php
var_dump($_POST);
?>
==================
Telnet to localhost:80 and send this request:
======================================================
POST http://localhost/test.php HTTP/1.0
Content-Length: 181
Content-Type: multipart/form-data; boundary=AaB03x

--AaB03x
Content-Disposition: form-data; name="field1"
Content-Type: text/plain;charset=windows-1250
Content-Transfer-Encoding: quoted-printable

Joe owes =80100.
--AaB03x--
======================================================
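Since the per-part Content-Transfer-Encoding header never reaches the script, one way to see it is to parse the raw body by hand. The following is a minimal sketch only: `parse_multipart_parts()` is a hypothetical helper, it assumes CRLF line endings, and it does not handle a boundary string that also occurs inside a part body.

```php
<?php
// Hypothetical helper (not a PHP built-in): split a raw
// multipart/form-data body into parts, keeping each part's headers.
function parse_multipart_parts($rawBody, $boundary)
{
    $parts  = array();
    $chunks = explode('--' . $boundary, $rawBody);
    array_shift($chunks); // drop the preamble before the first delimiter

    foreach ($chunks as $chunk) {
        // The closing delimiter is "--boundary--": its chunk starts with "--".
        if (strncmp($chunk, '--', 2) === 0) {
            break;
        }
        // Strip the CRLF after the delimiter and the one before the next.
        $chunk  = preg_replace('/\A\r\n|\r\n\z/', '', $chunk);
        // A blank line separates the part headers from the part body.
        $pieces = explode("\r\n\r\n", $chunk, 2);
        $body   = isset($pieces[1]) ? $pieces[1] : '';

        $headers = array();
        foreach (explode("\r\n", $pieces[0]) as $line) {
            $pair = explode(':', $line, 2);
            if (count($pair) === 2) {
                $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
            }
        }
        $parts[] = array('headers' => $headers, 'body' => $body);
    }
    return $parts;
}

// The body of the request above, with explicit CRLF line endings.
$raw = "--AaB03x\r\n"
     . "Content-Disposition: form-data; name=\"field1\"\r\n"
     . "Content-Type: text/plain;charset=windows-1250\r\n"
     . "Content-Transfer-Encoding: quoted-printable\r\n"
     . "\r\n"
     . "Joe owes =80100.\r\n"
     . "--AaB03x--\r\n";

$parts = parse_multipart_parts($raw, 'AaB03x');
echo strlen($raw), "\n";                                        // 181
echo $parts[0]['headers']['content-transfer-encoding'], "\n";   // quoted-printable
echo $parts[0]['body'], "\n";                                   // Joe owes =80100.
```

The byte count matches the Content-Length header of the request, which is a quick check that the blank lines and the trailing CRLF are in the right places.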
Expected result:
----------------
array(1) {
["field1"]=>
string(14) "Joe owes €100."
}
Actual result:
--------------
array(1) {
["field1"]=>
string(16) "Joe owes =80100."
}
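The difference between the two results is exactly one decoding step. As a sketch of what a script (or PHP itself) would have to do if it knew the part's encoding, assuming a hypothetical helper `decode_part_body()` built on the existing `quoted_printable_decode()` and `base64_decode()` functions:

```php
<?php
// Hypothetical helper (not a PHP built-in): undo a part's
// Content-Transfer-Encoding, given the header value.
function decode_part_body($body, $encoding)
{
    switch (strtolower(trim($encoding))) {
        case 'quoted-printable':
            return quoted_printable_decode($body);
        case 'base64':
            $decoded = base64_decode($body, true);
            if ($decoded === false) {
                throw new InvalidArgumentException('Invalid base64 data');
            }
            return $decoded;
        case '':
        case '7bit':
        case '8bit':
        case 'binary':
            // These values mean the data was not transformed.
            return $body;
        default:
            throw new InvalidArgumentException('Unknown transfer encoding');
    }
}

$raw     = 'Joe owes =80100.';
$decoded = decode_part_body($raw, 'quoted-printable');

var_dump(strlen($raw));      // int(16)
var_dump(strlen($decoded));  // int(14)
var_dump(bin2hex($decoded)); // string(28) "4a6f65206f77657320803130302e"
```

The decoded value contains the single byte 0x80, which is the euro sign in windows-1250. Converting it to another charset would be a separate step and is not shown here.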
------------------------------------------------------------------------