Edit report at https://bugs.php.net/bug.php?id=66618&edit=1

 ID:                 66618
 Comment by:         cmbecker69 at gmx dot de
 Reported by:        francois dot gannaz at silecs dot info
 Summary:            UTF-8 encoding error
 Status:             Open
 Type:               Bug
 Package:            Website problem
 PHP Version:        Irrelevant
 Block user comment: N
 Private report:     N

 New Comment:

I have not opened the ticket, but it seems reasonable to convert all old
user contributed notes to UTF-8; the newer ones seem to be anyway.

Previous Comments:
------------------------------------------------------------------------
[2014-02-03 14:50:11] bj...@php.net

thanks for the explanation, but I'm still missing the actual bug here.
Is this ticket for making sure all user contributed notes are utf-8?

------------------------------------------------------------------------
[2014-02-03 14:47:08] cmbecker69 at gmx dot de

If a browser shall display a page in UTF-8 encoding, and there are
unrecognized code points, it will substitute them by the Unicode
replacement character U+FFFD[1]. You can see that yourself if you visit
the array_search man page[2], and search for "greetz Udo". Just below
this text is an "ANSI" encoded non-breaking space character (0xA0),
which is displayed as <?>.

[1] <http://www.fileformat.info/info/unicode/char/0fffd/index.htm>
[2] <http://php.net/manual/en/function.array-search.php>

------------------------------------------------------------------------
[2014-02-03 14:23:18] bj...@php.net

I can't imagine that a handful of latin-1 encoded characters in the user
submitted data are causing your browser to present you weird data or
broken navigation.

------------------------------------------------------------------------
[2014-02-03 09:14:27] francois dot gannaz at silecs dot info

Removing one comment did not change the encoding problem.
Here is the direct link to the W3C HTML validator on one of the
offending pages:
http://validator.w3.org/check?uri=http%3A%2F%2Fphp.net%2Fmanual%2Fen%2Ffunction.array-merge.php&charset=%28detect+automatically%29&doctype=Inline&group=0

------------------------------------------------------------------------
[2014-01-31 18:12:18] cmbecker69 at gmx dot de

| As for the 1st one; I have deleted the note that had the
| invalid character point.

Actually, at least the comment by rafmavCHEZlibre_in_france is most
likely encoded as ISO-8859-1. The offending character is an é, which is
quite common in several languages. There might be a lot more of these
comments. Instead of deleting them, it might be worth transcoding them
to UTF-8.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=66618
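
For illustration only, here is a minimal Python sketch of the transcoding
idea discussed in the comments above. It is not taken from the php.net
code base; the function name to_utf8 and the sample byte strings are
hypothetical. It assumes that any note which is not valid UTF-8 was
written in ISO-8859-1, which is a guess: notes described as "ANSI" may
really be Windows-1252, and those two encodings differ in the range
0x80-0x9F (they agree on 0xA0 and on é, the two cases mentioned in this
report).

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw as UTF-8 bytes.

    Assumption (not confirmed by the report): input that is not valid
    UTF-8 is ISO-8859-1. Every byte sequence is decodable as ISO-8859-1,
    so the fallback never raises, but it can guess wrong.
    """
    try:
        raw.decode("utf-8")  # valid UTF-8 (including plain ASCII): keep as-is
        return raw
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1").encode("utf-8")


# Hypothetical sample: text followed by a lone 0xA0 byte, as described
# for the array_search note.
legacy_note = b"greetz Udo\xa0"

# Decoding the raw bytes as UTF-8 with replacement mimics what a browser
# does: the invalid byte becomes U+FFFD.
as_shown_by_browser = legacy_note.decode("utf-8", errors="replace")

# After transcoding, the same byte becomes a real non-breaking space.
fixed = to_utf8(legacy_note).decode("utf-8")

print(ascii(as_shown_by_browser))
print(ascii(fixed))
```

Checking for valid UTF-8 first keeps already-converted notes untouched,
so running such a pass twice would not double-encode them.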