ID:               32860
User updated by:  ast at gmx dot ch
Reported By:      ast at gmx dot ch
Status:           Open
Bug Type:         Feature/Change Request
Operating System: *
PHP Version:      4.3.11
New Comment:

Feature/Change request? I don't agree. Handling an HTTP header in a way
that contradicts the RFC that defines it doesn't make sense at all.
Therefore, it's a bug.
But it's not that important to me. Do what you consider the right thing.


Previous Comments:
------------------------------------------------------------------------

[2005-04-28 00:00:18] [EMAIL PROTECTED]

Reclassified.

------------------------------------------------------------------------

[2005-04-27 23:20:34] ast at gmx dot ch

Obviously, the bug report was mangled. Here's a pretty print of the
report / fix:
http://nei.ch/articles/quoted_string_cookie_fix.php

------------------------------------------------------------------------

[2005-04-27 21:46:18] ast at gmx dot ch

Description:
------------
/*
 * Description:
 * RFC 2965 describes the HTTP State Management mechanism.
 * From section "3.1 Syntax: General":
 *   av-pairs    =     av-pair *(";" av-pair)
 *   av-pair     =     attr ["=" value]              ; optional value
 *   attr        =     token
 *   value       =     token | quoted-string
 *
 * PHP 4.3.11 does not handle the case of "quoted-string" values.
 * See RFC 2616 section "2.2 Basic Rules" for a definition of "quoted-string".
 *   quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
 *   qdtext         = <any TEXT except <">>
 *
 *   The backslash character ("\") MAY be used as a single-character
 *   quoting mechanism only within quoted-string and comment constructs.
 *
 *   quoted-pair    = "\" CHAR
 *
 * PHP 4.3.11 urlencodes all cookie name = value pairs that it sets with setcookie().
 * Therefore, it can handle values that contain the separators "," and ";". But RFC 2965
 * specifies that an HTTP Cookie header sent from the user agent to the server may contain
 * av-pairs whose value is either a token or a quoted-string.
 *
 * If one sets a cookie not with PHP's setcookie() function, but directly with header(),
 * then it is sent correctly to the user agent and the user agent also returns it
 * correctly. But when PHP reads the HTTP Cookie header back into $_COOKIE, it does
 * not handle quoted-strings.
 *
 * Result:
 * Wrong cookie values in $_COOKIE.
 *
 * The bug is in PHP's source in function
 *   SAPI_API SAPI_TREAT_DATA_FUNC(php_default_treat_data)
 * It parses the HTTP Cookie header and directly uses "," and ";" as separators.
 * A slightly more complicated handling of the HTTP Cookie header is required.
 * In addition to the current code, one has to handle:
 *   - quoted-strings: separators "," and ";" may be in quoted-strings
 *   - double-quote marks escaped by "\" don't end a quoted-string
 *
 * Cookies with values that are not urlencoded may come from:
 *   - non-PHP applications on the same host
 *   - RFC 2965 compliant PHP cookies that were set with header() instead of setcookie().
 *
 * Example:
 * In PHP script:
 *   header('Set-Cookie: TestCookie = "value with , perfectly ; allowed 8-BIT characters ' .
 *          'and escaped \" double-quote marks!"');
 * The cookie is successfully sent to the user agent. The user agent sends it back with a
 * perfectly intact value.
 * PHP receives the HTTP Cookie header from the webserver (I inserted the line break):
 *   Cookie: TestCookie="value with , perfectly ; allowed 8-BIT characters and escaped \"
 *           double-quote marks!"\r\n
 * Then PHP parses the HTTP Cookie header, ignoring the case of quoted-strings, and fills the
 * superglobal $_COOKIE with:
 *   ["TestCookie"]=>
 *     string(24) ""value with , perfectly "
 *   ["allowed_8-BIT_characters_and_escaped_\"_double-quote_marks!""]=>
 *     string(0) ""
 * If PHP handled the HTTP Cookie header correctly, one would get:
 *   ["TestCookie"]=>
 *     string(86) "value with , perfectly ; allowed 8-BIT characters and escaped \" double-quote marks!"
 *
 * I even think that, if PHP really handled both "," and ";" as separators, the current PHP
 * should have created three cookies out of the above name = value pair.
 *
 * References:
 *   RFC 2965: http://rfc.net/rfc2965.html
 *   RFC 2616: http://rfc.net/rfc2616.html
 */

Proposed fix:
In the PHP source, replace the simple strtok() call in
SAPI_API SAPI_TREAT_DATA_FUNC(php_default_treat_data) by the C equivalent
of the following PHP code:

/**
 * Fix the superglobal $_COOKIE to conform with RFC 2965
 *
 * This function reevaluates the HTTP Cookie header and populates $_COOKIE with the correct
 * cookies.
 */
function fixCookieVars() {
    if (empty($_SERVER['HTTP_COOKIE'])) {
        return;
    }
    $_COOKIE = array();

    /* Check if the Cookie header contains quoted-strings */
    if (strstr($_SERVER['HTTP_COOKIE'], '"') === false) {
        /*
         * Use fast method, no quoted-strings in the header.
         * Get rid of less specific cookies if multiple cookies with the same NAME
         * are present. Do this by going from left/first cookie to right/last cookie.
         */
        $tok = strtok($_SERVER['HTTP_COOKIE'], ',;');
        /* strtok() returns false when no tokens are left; a token like "0" must not end the loop */
        while ($tok !== false) {
            GalleryUtilities::_registerCookieAttr($tok);
            $tok = strtok(',;');
        }
    } else {
        /*
         * We can't just tokenize the Cookie header string because there are
         * quoted-strings, and delimiters in a quoted-string should be handled as
         * part of the value and not as delimiters.
         * Thus, we have to parse it character by character.
         */
        $quotedStringOpen = false;
        $string = $_SERVER['HTTP_COOKIE'];
        $len = strlen($string);
        $i = 0;
        $lastPos = 0;
        while ($i < $len) {
            switch ($string[$i]) {
            case ',':
            case ';':
                if ($quotedStringOpen) {
                    /* Ignore separators within quoted-strings */
                } else {
                    /*
                     * else, this is an attr[=value] pair.
                     * The third argument of substr() is a length, not an end position.
                     */
                    GalleryUtilities::_registerCookieAttr(substr($string, $lastPos, $i - $lastPos));
                    $lastPos = $i + 1; /* next attr starts at next char */
                }
                break;
            case '"':
                $quotedStringOpen = !$quotedStringOpen;
                break;
            case '\\':
                /* escape the next character = jump over it */
                $i++;
                break;
            }
            $i++;
        }
        /* register last attr in header, but only if the syntax is correct */
        if (!$quotedStringOpen) {
            GalleryUtilities::_registerCookieAttr(substr($string, $lastPos));
        }
    }
}

/**
 * Register a Cookie Var safely
 * @param string the cookie var attr, NAME [=VALUE]
 */
function _registerCookieAttr($attr) {
    /*
     * Split NAME [=VALUE], value is optional for all attributes
     * but the cookie name
     */
    if (($pos = strpos($attr, '=')) !== false) {
        $val = substr($attr, $pos + 1);
        $key = substr($attr, 0, $pos);
    } else {
        /* No cookie name=value attr, we can ignore it */
        return;
    }

    /* Urldecode header data (php-style of name = attr handling) */
    $key = trim(urldecode($key));

    /* Don't accept zero length key */
    if (($len = strlen($key)) == 0) {
        return;
    }

    /* Omitted: handle array cookies, make the name binary safe, ... */

    /*
     * Don't register non-NAME attributes like domain, path, ... which all
     * start with a dollar sign according to RFC 2965.
     */
    if (strpos($key, '$') === 0) {
        return;
    }

    /* Urldecode value */
    $val = trim(urldecode($val));

    /* Addslashes if magic_quotes_gpc is on */
    if (get_magic_quotes_gpc()) {
        $key = addslashes($key);
        $val = addslashes($val);
    }

    /* The first (most specific) cookie with a given name wins */
    if (!isset($_COOKIE[$key])) {
        $_COOKIE[$key] = $val;
    }
}

Reproduce code:
---------------
$value = '"value with , perfectly ; allowed 8-BIT characters and escaped \" double-quote marks!"';
$cookie = 'Set-Cookie: TestCookie = ' .
    $value;
header($cookie);
/* setcookie() calls urlencode($value), so this would work */
// setcookie("TestCookie", $value);

if (!empty($_COOKIE)) {
    /* double quotes, so that \n is a real line break */
    print ".<pre>\n";
    if (isset($_COOKIE)) {
        var_dump($_COOKIE);
    }
    print "\n\n";
    if (isset($_SERVER['HTTP_COOKIE'])) {
        var_dump($_SERVER['HTTP_COOKIE']);
    }
    echo '</pre>';
} else {
    print 'please refresh the page';
}

Expected result:
----------------
After 1 refresh, the var_dump of the $_COOKIE variable should list a
single cookie name with the original value.

Actual result:
--------------
After 1 refresh, $_COOKIE contains not 1 cookie name with an associated
value but 2 cookies, because PHP doesn't parse the HTTP Cookie header
according to RFC 2965.


------------------------------------------------------------------------
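
To make the difference between the expected and the actual result easier to see, here is a minimal sketch of only the splitting step, run against the Cookie header from the example. The function names naiveSplit() and quoteAwareSplit() are hypothetical helpers invented for this illustration; they are not part of PHP, and this is not the C code of php_default_treat_data. The naive variant splits on both "," and ";" as the description above assumes, so it yields three pieces, while the two cookies actually observed suggest the installed PHP splits on fewer separators than that.

```php
<?php
/* Hypothetical helper: split on "," and ";" with no regard for quoted-strings. */
function naiveSplit($header) {
    $parts = array();
    $tok = strtok($header, ',;');
    while ($tok !== false) {
        $parts[] = $tok;
        $tok = strtok(',;');
    }
    return $parts;
}

/*
 * Hypothetical helper: split on "," and ";" only outside quoted-strings.
 * Inside a quoted-string, "\" quotes the next character (RFC 2616 quoted-pair).
 */
function quoteAwareSplit($header) {
    $parts = array();
    $inQuotes = false;
    $start = 0;
    $len = strlen($header);
    for ($i = 0; $i < $len; $i++) {
        $c = $header[$i];
        if ($inQuotes && $c === '\\') {
            $i++; /* jump over the escaped character */
        } elseif ($c === '"') {
            $inQuotes = !$inQuotes;
        } elseif (!$inQuotes && ($c === ',' || $c === ';')) {
            $parts[] = substr($header, $start, $i - $start);
            $start = $i + 1;
        }
    }
    /* register the last piece only if every quoted-string was closed */
    if (!$inQuotes) {
        $parts[] = substr($header, $start);
    }
    return $parts;
}

/* Single quotes: the backslash before the double-quote mark stays literal. */
$header = 'TestCookie="value with , perfectly ; allowed 8-BIT characters and escaped \" double-quote marks!"';

print count(naiveSplit($header)) . "\n";      /* prints 3 */
print count(quoteAwareSplit($header)) . "\n"; /* prints 1 */
```

The quote-aware variant returns the header as one name = value pair, which is what the "Expected result" section asks for. Stripping the surrounding double-quote marks and resolving the quoted-pairs is left out of this sketch.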