Edit report at https://bugs.php.net/bug.php?id=66618&edit=1
ID:                 66618
User updated by:    francois dot gannaz at silecs dot info
Reported by:        francois dot gannaz at silecs dot info
Summary:            UTF-8 encoding error
-Status:            Feedback
+Status:            Open
Type:               Bug
Package:            Website problem
PHP Version:        Irrelevant
Block user comment: N
Private report:     N

New Comment:

Removing one comment did not change the encoding problem.
Here is the direct link to the W3C HTML validator on one of the offending pages:
http://validator.w3.org/check?uri=http%3A%2F%2Fphp.net%2Fmanual%2Fen%2Ffunction.array-merge.php&charset=%28detect+automatically%29&doctype=Inline&group=0

Previous Comments:
------------------------------------------------------------------------
[2014-01-31 18:12:18] cmbecker69 at gmx dot de

| As for the 1st one; I have deleted the note that had the
| invalid character point.

Actually, at least the comment by rafmavCHEZlibre_in_france is most likely
encoded as ISO-8859-1. The offending character is an é, which is quite
common in several languages. There might be many more such comments.
Instead of deleting them, it might be worth transcoding them to UTF-8.

------------------------------------------------------------------------
[2014-01-31 17:05:20] bj...@php.net

Your second problem sounds like: https://github.com/php/web-php/pull/32

As for the 1st one; I have deleted the note that had the invalid character point.
Can you recheck in about 60 minutes and see if it works?

------------------------------------------------------------------------
[2014-01-31 16:48:05] francois dot gannaz at silecs dot info

After some debugging, there are 2 distinct problems:

1. Some comments on PHP pages are badly encoded. They contain invalid bytes,
which prevent the W3C validator (or iconv) from parsing the page. There are
such problems in the 420 KB of "http://php.net/manual/en/function.array-merge.php".

2. The problem with "zero-width spaces" that get printed is probably related
to CSS, not to encoding bugs.
At least with Opera 15, disabling the following line fixes the display:

    body, input, textarea {
        font-family: "Source Sans Pro", "Helvetica", "Arial", sans-serif;
    }

------------------------------------------------------------------------
[2014-01-31 15:06:37] francois dot gannaz at silecs dot info

Description:
------------
The encoding of most pages is invalid. The most frequent error is that each
underscore character in the left column (e.g. the list of functions) is
followed by an invalid byte.

The behavior of web browsers varies. Most silently ignore the invalid bytes,
and some display a special character for each error.

The [W3C validator](http://validator.w3.org/) confirms the problem. Here is
its answer when asked to validate
"http://php.net/manual/en/function.array-merge.php":

"Sorry, I am unable to validate this document because on line 3400 it
contained one or more bytes that I cannot interpret as utf-8 (in other words,
the bytes found are not valid values in the specified Character Encoding).
Please check both the content of the file and the character encoding
indication.
The error was: utf8 "\xE9" does not map to Unicode"

------------------------------------------------------------------------

--
PHP Webmaster List Mailing List (http://www.php.net/)
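
For anyone who wants to reproduce the first problem locally, here is a minimal Python sketch of the two steps discussed above: locating the bytes that are not valid UTF-8 (such as the "\xE9" the validator reports), and transcoding only those bytes instead of deleting the note. The function names `find_invalid_utf8` and `repair_to_utf8` are made up for this example, and the choice of ISO-8859-1 as the fallback is only the guess made in the comment above. This is not how php.net stores or serves user notes.

```python
def find_invalid_utf8(data: bytes) -> list[int]:
    """Return the byte offsets at which strict UTF-8 decoding fails."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the buffer is valid UTF-8
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are relative to the slice data[pos:]
            offsets.append(pos + exc.start)
            pos += exc.end
    return offsets


def repair_to_utf8(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode as UTF-8, decoding only the invalid byte runs with `fallback`.

    Assumption: the stray bytes were written in `fallback` (ISO-8859-1 here).
    """
    parts = []
    pos = 0
    while pos < len(data):
        try:
            parts.append(data[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as exc:
            bad_start = pos + exc.start
            bad_end = pos + exc.end
            parts.append(data[pos:bad_start].decode("utf-8"))      # valid prefix
            parts.append(data[bad_start:bad_end].decode(fallback))  # stray bytes
            pos = bad_end
    return "".join(parts)


# A note mixing an ISO-8859-1 "é" (0xE9) with a correctly encoded UTF-8 "é".
sample = b"caf\xe9 and " + "\u00e9".encode("utf-8")
bad_offsets = find_invalid_utf8(sample)
repaired = repair_to_utf8(sample)
```

Text that is already valid UTF-8 passes through unchanged, so running this over every note would only touch the ones that contain stray bytes. The ISO-8859-1 guess would still need checking by hand for notes written in other legacy encodings.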