From:             mickoz at parodius dot com
Operating system: All (PHP Behavior)
PHP version:      5.2.4
PHP Bug Type:     Output Control
Bug description:  PHP's url_rewriter url encoding and html escaping behavior 
seems wrong

Description:
------------
When using PHP's url_rewriter (with output_add_rewrite_var()), I have
noticed that it does this:
- It will urlencode() the value when rewriting a URL
- It will not urlencode() the name when rewriting a URL
- If we set it to write to forms:
  - It will create a hidden input field after the form tag.
  - It will urlencode() the value when writing the value in a form input
tag
  - It will not urlencode() the name when writing the name in a form input
tag
- On top of that, while writing this, I found that it never applies
htmlspecialchars() to what it writes, which is probably why we need to
change arg_separator.output like this:
"ini_set('arg_separator.output', '&amp;');"

Someone using a name/value pair that doesn't contain any HTML or URL
special characters won't notice anything.  However, if we use special
characters in either of them, that can cause problems.

For now it looks like we should avoid special characters in the name and
value.  Maybe there is a reason for this behavior beyond what I can think
of right now, but I believe it should be fixed properly.

Unfortunately, doing a "smart" htmlspecialchars() encoding might also be
problematic, since a lot of people probably use the "arg_separator.output =
&amp;" hack for XHTML compliance... it would not be a smart idea to
automatically turn it into &amp;amp; (after HTML-escaping it). (If my
ideas get implemented, there will need to be a way to control this
behavior while staying backward compatible...)
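
To make the backward-compatibility concern concrete, here is a minimal
sketch. It assumes the separator hack sets arg_separator.output to the
entity &amp;, and it uses http_build_query() only as a stand-in for the
rewriter (the rewriter's internals are not shown in this report). The
variable names are illustrative.

```php
<?php
// Illustrative only: http_build_query() stands in for the rewriter here.
$vars = ['a' => '1', 'b' => '2'];

// Separator as used by the XHTML hack (the string is already HTML-escaped).
$hackQuery = http_build_query($vars, '', '&amp;');

// Escaping the already-escaped string a second time corrupts the separator.
$doubleEscaped = htmlspecialchars($hackQuery);

// Plain separator, HTML-escaped exactly once at output time.
$escapedOnce = htmlspecialchars(http_build_query($vars, '', '&'));

echo $hackQuery, "\n", $doubleEscaped, "\n", $escapedOnce, "\n";
```

The first and third strings are identical, which is why a fix that escapes
the output would need to detect or ignore a separator that is already an
entity.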

----------------------------------------
Example #1:
----------------------------------------

<?php
ini_set('url_rewriter.tags',
'a=href,area=href,frame=src,input=src,form=action,fieldset=');
output_add_rewrite_var("test/name", "test/value");
?>
<a href="/test2">linkname</a>
<form action="/test" method="get">
<input type="text" value="" />
<input type="submit" value="ok" />
</form>

----------------------------------------
will output this:
----------------------------------------

<a href="/test2?test/name=test%2Fvalue">linkname</a>
<form action="/test?test/name=test%2Fvalue" method="get"><input
type="hidden" name="test/name" value="test%2Fvalue" />
<input type="text" value="" />
<input type="submit" value="ok" />
</form>

----------------------------------------
while I would expect:
----------------------------------------

<a href="/test2?test%2Fname=test%2Fvalue">linkname</a>
<form action="/test?test%2Fname=test%2Fvalue" method="get"><input
type="hidden" name="test/name" value="test/value" />
<input type="text" value="" />
<input type="submit" value="ok" />
</form>

----------------------------------------
Example #2: (with HTML special characters)
----------------------------------------

<?php
ini_set('url_rewriter.tags',
'a=href,area=href,frame=src,input=src,form=action,fieldset=');
output_add_rewrite_var("test/name\"&", "test/value\"&");
?>
<a href="/test2">linkname</a>
<form action="/test" method="get">
<input type="text" value="" />
<input type="submit" value="ok" />
</form>

----------------------------------------
will output this:
----------------------------------------

<a href="/test2?test/name"&=test%2Fvalue%22%26">linkname</a>
<form action="/test?test/name"&=test%2Fvalue%22%26" method="get"><input
type="hidden" name="test/name"&" value="test%2Fvalue%22%26" />
<input type="text" value="" />
<input type="submit" value="ok" />
</form>

----------------------------------------
while I would expect:
----------------------------------------

<a href="/test2?test%2Fname%22%26=test%2Fvalue%22%26">linkname</a>
<form action="/test?test%2Fname%22%26=test%2Fvalue%22%26"
method="get"><input type="hidden" name="test/name&quot;&amp;"
value="test/value&quot;&amp;" />
<input type="text" value="" />
<input type="submit" value="ok" />
</form>

========================================
Summary of the behavior I would expect (written in notes + pseudo/PHP code)
========================================

- When rewriting a URL, append to the original URL (which should have
been escaped properly by the user):
  - "?" if not already present (already doing this)
  -
htmlspecialchars(urlencode($name1).'='.urlencode($value1).'&'.urlencode($name2).'='.urlencode($value2).'&'...)
    where '&' is the argument separator (and if doing so, we won't need to
put HTML in the arg separator value!!! That did not make sense to me at
first, even though there may be a reason for it, especially now that people
rely on it...).
    - urlencode the name and the value
    - HTML-escape what we output (since it always ends up inside an HTML
attribute value)

- When adding input type=hidden fields
  - <input type="hidden" name="<?php echo htmlspecialchars($name1); ?>"
value="<?php echo htmlspecialchars($value1); ?>" />
  - no URL escaping; it should not be done here!
    - HTML-escape the name and value, as for any attribute value

// There may be details that I forgot, but this is a good start and I
can always help think it through further (and validate the logic) if
someone handles this case
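
The two rules above can be sketched in plain PHP. This is only an
illustration of the expected output, not how the rewriter is implemented:
sketch_rewrite_url() and sketch_hidden_inputs() are hypothetical helper
names, and the sketch assumes a plain '&' separator and a URL without a
fragment.

```php
<?php
// Hypothetical helper: append rewrite vars to a URL inside an HTML attribute.
function sketch_rewrite_url(string $url, array $vars): string
{
    $pairs = [];
    foreach ($vars as $name => $value) {
        // URL-encode both the name and the value.
        $pairs[] = urlencode((string) $name) . '=' . urlencode((string) $value);
    }
    // Start the query string with "?" unless the URL already has one.
    $glue = (strpos($url, '?') === false) ? '?' : '&';
    // The original URL is assumed to be escaped already by the page author;
    // only the appended part is HTML-escaped here.
    return $url . htmlspecialchars($glue . implode('&', $pairs));
}

// Hypothetical helper: build hidden inputs with HTML escaping only.
function sketch_hidden_inputs(array $vars): string
{
    $html = '';
    foreach ($vars as $name => $value) {
        // No urlencode() here: the browser encodes values on submission.
        $html .= '<input type="hidden" name="' . htmlspecialchars((string) $name)
            . '" value="' . htmlspecialchars((string) $value) . '" />';
    }
    return $html;
}

$vars = ['test/name"&' => 'test/value"&'];
echo sketch_rewrite_url('/test2', $vars), "\n";
echo sketch_hidden_inputs($vars), "\n";
```

With the name/value pair from Example #2, this prints the URL and hidden
field in the form I would expect from the rewriter.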

----------------------------------------
Notes:
----------------------------------------

- The problem that led me to submit this bug is that I was using the
url_rewriter to write a value like "N/A".  It added a hidden input field to
my forms, and one of my forms has method="get".  When a form is submitted
with method=get, the browser escapes the form's input values... and since
the value was already escaped... it was escaped twice (which makes it
impossible to use that kind of value in that context, unless we add a hack
around it...)
  example:
  output_add_rewrite_var("var", "N/A");
  becomes:
  <input type="hidden" name="var" value="N%2FA" />
  which then becomes, in the URL after submitting the form:
  N%252FA
  (because % gets escaped to %25)

- A solution is to never use special characters in the name and value when
using the PHP url_rewriter, but it still doesn't feel logical (unless it
applies to other usages that I am not aware of)

- Another solution might be to write my own output handler

- I believe this kind of thing is at the origin of other related problems
(or "hacks" to fix them in PHP), and that fixing it there would avoid needs
like having to change the arg separator to include HTML (again, there is
probably stuff I don't know which renders this impossible, but maybe not)
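
The double escaping from the first note can be reproduced without the
rewriter at all. In this small sketch the second urlencode() call stands in
for what the browser does when it submits a method="get" form; the variable
names are only illustrative.

```php
<?php
$value = 'N/A';

// What the rewriter currently writes into the hidden field's value.
$written = urlencode($value);

// What ends up in the URL once the browser encodes the field on submit.
$submitted = urlencode($written);

// What the script gets back after the query string is decoded once.
$received = urldecode($submitted);

echo $written, "\n", $submitted, "\n", $received, "\n";
```

The script receives "N%2FA" instead of "N/A", so the original value cannot
be recovered without an extra urldecode() hack.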

Hope that helps and that it will improve things (or that I will simply
understand the reason for all these behaviors)


-- 
Edit bug report at http://bugs.php.net/?id=42593&edit=1