Re: [PHP] htmlentities is incomplete: does not cover rsquo etc

Heddon's Gate Hotel Fri, 13 Mar 2009 18:07:07 -0700

Thanks Jan, it's much clearer now. My knowledge about character encodings has multiplied 100-fold in the last 24 hours of research.

Would it be a good idea for the PHP Manual to address some of these issues, by explaining good practice in encoding arbitrary user input in forms (for example), for the benefit of those, like me, for whom character sets are a bit of a black art?

Also, I still cannot persuade get_html_translation_table to list those non-Latin1 entities. This is not an important issue, since it appears to be only an information function, but it would be nice if it were consistent with htmlentities and html_entity_decode.
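
For later readers, a minimal sketch of how the table can be inspected. It assumes PHP 5.3.4 or later, where, as far as I know, get_html_translation_table accepts an encoding as a third argument; on older versions that argument is not available and the table stays Latin1-only. The variable names are only illustrative.

```php
<?php
// Assumption: PHP 5.3.4+, where the third (encoding) argument exists.
$table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');

// Keys of the table are the raw characters, values are the entities.
$rsquo = "\xE2\x80\x99"; // U+2019 RIGHT SINGLE QUOTATION MARK as UTF-8 bytes

// Check whether the non-Latin1 entities are listed.
$hasRsquo = isset($table[$rsquo]) && $table[$rsquo] === '&rsquo;';
$hasLdquo = in_array('&ldquo;', $table, true);

var_dump($hasRsquo, $hasLdquo);
```

If both values come back true, the table agrees with what htmlentities does when it is given the same encoding.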


Eddie

From Jan G.B. 13/03/2009 17:27:

2009/3/13 Heddon's Gate Hotel <[email protected]>:

The string function htmlentities seems to have very incomplete coverage of
the HTML entities listed in the HTML 4 spec.  For example, it does not know
about rsquo, lsquo, rdquo, ldquo, etc.  This is confirmed by looking at the
output of get_html_translation_table, which does not list these entities.

My impression is that it covers those HTML entities that are in ISO-8859-1,
but not the others.  Is this deliberate?  If so, the Manual is misleading
because it suggests that all HTML entities are covered. Otherwise, is this a
bug?


Well, if you specify the input charset you'll have no problem at all. ;)


<?= htmlentities('string with UTF-8: ±ªÐº×N>>µ»n“¢µ€jæ', ENT_QUOTES,
'UTF-8'); ?>

Latin1, AKA ISO-8859-1, has no ldquo, bdquo, ndash and the like.
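
To make the difference visible, here is a small sketch that runs the same UTF-8 bytes through htmlentities twice, once with the correct charset and once declared as ISO-8859-1. The sample string and variable names are made up for illustration; the characters are written as byte escapes so the example does not depend on the encoding of the source file.

```php
<?php
// UTF-8 bytes for rsquo (U+2019), ldquo (U+201C), rdquo (U+201D), ndash (U+2013).
$text = "It\xE2\x80\x99s \xE2\x80\x9Cquoted\xE2\x80\x9D \xE2\x80\x93 done";

// Charset declared correctly: the multi-byte characters map to named entities.
$utf8 = htmlentities($text, ENT_QUOTES, 'UTF-8');

// Charset declared as Latin1: every byte is treated as a separate character,
// so the lead byte 0xE2 becomes &acirc; and no rsquo/ldquo entity appears.
$latin1 = htmlentities($text, ENT_QUOTES, 'ISO-8859-1');

echo $utf8, "\n";   // It&rsquo;s &ldquo;quoted&rdquo; &ndash; done
echo $latin1, "\n";
```

The second line of output is the mojibake you get when the declared charset does not match the actual bytes, which is why the entities looked "missing".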

Regards,



--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
