Edit report at https://bugs.php.net/bug.php?id=62861&edit=1
ID:                 62861
Updated by:         ras...@php.net
Reported by:        soapergem at gmail dot com
Summary:            htmlentities returns empty string when it shouldn't
-Status:            Open
+Status:            Not a bug
Type:               Bug
Package:            *General Issues
Operating System:   Windows
PHP Version:        5.4.6
Block user comment: N
Private report:     N

New Comment:

UTF-8 is only compatible with low-ascii, not with high. The copyright symbol
in ISO-8859-1 is character code (in hex) <A9>. In UTF-8 the copyright symbol
is represented by two bytes, <C2><A9>.

The world has gone UTF-8. If your editor is in UTF-8 mode and you enter/paste
a copyright symbol and pass it to htmlentities() you will get "&copy;" back.
So rather than change the code to hardcode ISO-8859-1 you should convert your
datasources to UTF-8. Most of them are probably already UTF-8, which means
that your current code was likely not handling these correctly, since it
assumed ISO-8859-1 before.

For some perspective:

http://w3techs.com/technologies/overview/character_encoding/all

which shows that 72% of the top-million sites on the Web are using UTF-8.
And this number is growing.

Previous Comments:
------------------------------------------------------------------------
[2012-08-19 04:14:03] soapergem at gmail dot com

Description:
------------
Doesn't UTF-8 include basic ASCII characters, too? Right now when I try to
encode the copyright symbol (©) using htmlentities (it should encode to
&copy;), it doesn't work. I discovered this since the default encoding for
htmlentities() was switched from ISO-8859-1 to UTF-8 in version 5.4.

I have plenty of places where I rely on basic symbols, such as the copyright
symbol, being encoded properly with htmlentities(). Having to go in and
change all the instances of htmlentities($string) to htmlentities($string,
ENT_COMPAT | ENT_HTML401, 'ISO-8859-1') is not practical (there are MANY).
And with the whole output of the function being blank, it just makes my
scripts completely unusable now. Help!

Test script:
---------------
<?php
echo htmlentities('©', ENT_COMPAT | ENT_HTML401, 'UTF-8');
?>

Expected result:
----------------
&copy;

Actual result:
--------------
(Nothing - an empty string)

------------------------------------------------------------------------
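
For illustration, a minimal sketch of the byte-level difference described in
the comment above. The copyright sign is written as hex escapes so the result
does not depend on the encoding the script file is saved in. The variable
names are made up for this example, and the conversion step assumes the
mbstring extension is available; neither comes from the original report.

```php
<?php
// The copyright sign as raw bytes in each encoding.
$latin1 = "\xA9";      // ISO-8859-1: a single byte
$utf8   = "\xC2\xA9";  // UTF-8: two bytes

// Valid UTF-8 input is converted to the named entity (&copy;).
$fromUtf8 = htmlentities($utf8, ENT_COMPAT | ENT_HTML401, 'UTF-8');

// A lone 0xA9 byte is not a valid UTF-8 sequence, so with these flags
// htmlentities() returns an empty string, as in the report.
$fromLatin1AsUtf8 = htmlentities($latin1, ENT_COMPAT | ENT_HTML401, 'UTF-8');

// Declaring the encoding the data really uses also gives &copy;.
$fromLatin1 = htmlentities($latin1, ENT_COMPAT | ENT_HTML401, 'ISO-8859-1');

// Converting the data to UTF-8 first (assumes mbstring is loaded).
$converted = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

var_dump($fromUtf8, $fromLatin1AsUtf8, $fromLatin1, bin2hex($converted));
```

Converting the data once at the source, as in the last step, avoids having to
pass an explicit encoding at every htmlentities() call site.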