From:             johnfivealive at hotmail dot com
Operating system: Fedora Core 1
PHP version:      5.0.0RC2
PHP Bug Type:     Strings related
Bug description:  The htmlentities functions should detect already converted characters

Description:
------------
The htmlentities function does not detect whether a character is already
part of a character entity. For example, the character '&' is usually
represented as '&amp;' in valid XHTML and XML markup. If one calls
htmlentities on the string "&amp;", the function returns "&amp;amp;".
I believe this to be a bug. The function should check for cases like this
when converting characters to their entity representations. I imagine the
check would only be needed when the character being converted is an '&'
that begins an existing entity.

This is annoying in the following situation. One properly escapes and
converts all characters before inserting the XHTML/XML into a database,
then pulls that data out to be displayed in an HTML <textarea></textarea>
field. One would usually call htmlentities() on the content to be
displayed in the textarea so that everything is rendered correctly by the
browser and the markup is valid. In this case, if any of the content
contains the "&amp;" entity, it suffers from the bug described above:
every '&' character shows up in the textarea as "&amp;", because the
underlying HTML code looks something like this after htmlentities has
been called:

<textarea>
&lt;em&gt;emphasis tags are escaped correctly&lt;/em&gt;
&lt;br /&gt;
&lt;br /&gt;
but, the &amp;amp; character is not
</textarea>

This would render in a textarea as:

<em>emphasis tags are escaped correctly</em>
<br />
<br />
but, the &amp; character is not

See my frustration?
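
The textarea scenario above can be reproduced with a short script. This is
only a sketch: the variable names are hypothetical, the stored value stands
in for whatever would come back from the database, and the ENT_QUOTES and
'UTF-8' arguments are assumptions rather than part of the original report.

```php
<?php
// Hypothetical value as read back from the database: valid XHTML in which
// the ampersand is already written as the entity &amp;.
$stored = "<em>emphasis tags are escaped correctly</em>\n"
        . "<br />\n"
        . "<br />\n"
        . "but, the &amp; character is not";

// Escape for display inside a textarea. htmlentities() treats every '&'
// as a literal character, including the one that starts "&amp;".
$escaped = htmlentities($stored, ENT_QUOTES, 'UTF-8');

echo "<textarea>\n" . $escaped . "\n</textarea>\n";
```

The echoed markup matches the underlying HTML shown above: the tags come
out as &lt;em&gt; and &lt;br /&gt;, and the existing entity comes out as
&amp;amp;.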

Reproduce code:
---------------
echo htmlentities( "&amp;" );


Expected result:
----------------
The above code should detect the entity and return the correct string:
"&amp;" instead of "&amp;amp;"

Actual result:
--------------
The above code returns "&amp;amp;" which it probably should not
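
For readers on later PHP versions: htmlentities() gained a fourth
parameter, double_encode, in PHP 5.2.3, so it is not available in the
5.0.0RC2 this report was filed against. A minimal sketch, assuming PHP
5.2.3 or newer and UTF-8 input, of how that parameter yields the result
expected above (variable names are hypothetical):

```php
<?php
// Default behaviour (double_encode = true): the '&' of an existing entity
// is encoded again.
$default = htmlentities("&amp;", ENT_QUOTES, 'UTF-8');

// With double_encode = false, existing entities are left untouched,
// while a bare '&' is still converted.
$once = htmlentities("&amp;", ENT_QUOTES, 'UTF-8', false);
$bare = htmlentities("a & b", ENT_QUOTES, 'UTF-8', false);

echo $default, "\n"; // &amp;amp;
echo $once, "\n";    // &amp;
echo $bare, "\n";    // a &amp; b
```

The trade-off is that with double_encode disabled the function can no
longer tell an existing entity apart from text that is meant to show the
literal characters "&amp;", so it only suits input known to be partly
escaped already.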

-- 
Edit bug report at http://bugs.php.net/?id=28357&edit=1