Edit report at https://bugs.php.net/bug.php?id=66618&edit=1

 ID:                 66618
 Comment by:         cmbecker69 at gmx dot de
 Reported by:        francois dot gannaz at silecs dot info
 Summary:            UTF-8 encoding error
 Status:             Open
 Type:               Bug
 Package:            Website problem
 PHP Version:        Irrelevant
 Block user comment: N
 Private report:     N

 New Comment:

If a browser renders a page as UTF-8 and encounters byte sequences
that are not valid UTF-8, it substitutes the Unicode replacement
character U+FFFD[1] for them.

You can see this for yourself on the array_search manual page[2]:
search for "greetz Udo". Just below this text is a non-breaking space
in a single-byte "ANSI" encoding (byte 0xA0), which is displayed as
<?>.

[1] <http://www.fileformat.info/info/unicode/char/0fffd/index.htm>
[2] <http://php.net/manual/en/function.array-search.php>
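The substitution can be reproduced outside a browser. Below is a minimal
Python sketch; the sample byte string is made up for illustration and is
not copied from the manual page, and it assumes the rest of the note is
plain ASCII:

```python
# Hypothetical sample: ASCII text followed by a lone 0xA0 byte. That byte is
# a non-breaking space in ISO-8859-1/Windows-1252, but on its own it is not
# a valid UTF-8 sequence.
raw = b"greetz Udo\xa0"

# errors="replace" substitutes U+FFFD for undecodable bytes, which is
# comparable to what a browser does when it renders the page as UTF-8.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the note was presumably written in yields
# the intended non-breaking space (U+00A0).
as_latin1 = raw.decode("iso-8859-1")

print(repr(as_utf8))
print(repr(as_latin1))
```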


Previous Comments:
------------------------------------------------------------------------
[2014-02-03 14:23:18] bj...@php.net

I can't imagine that a handful of Latin-1-encoded characters in the 
user-submitted data are causing your browser to show you weird data or broken 
navigation.

------------------------------------------------------------------------
[2014-02-03 09:14:27] francois dot gannaz at silecs dot info

Removing one comment did not change the encoding problem. Here is the direct 
link to the W3C HTML validator on one of the offending pages:
http://validator.w3.org/check?uri=http%3A%2F%2Fphp.net%2Fmanual%2Fen%2Ffunction.array-merge.php&charset=%28detect+automatically%29&doctype=Inline&group=0

------------------------------------------------------------------------
[2014-01-31 18:12:18] cmbecker69 at gmx dot de

| As for the 1st one; I have deleted the note that had the 
| invalid character point.

Actually, at least the comment by rafmavCHEZlibre_in_france is most
likely encoded as ISO-8859-1.  The offending character is an é, 
which is quite common in several languages.  There might be a lot 
more of these comments. Instead of deleting them, it might be 
worth transcoding them to UTF-8.
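
A transcoding pass along these lines might look like the following
Python sketch. The helper name to_utf8 and the choice of ISO-8859-1 as
the fallback are assumptions for illustration; this is not how the
php.net notes are actually stored or processed:

```python
def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return raw as UTF-8 bytes (hypothetical helper).

    Bytes that already decode as UTF-8 are returned unchanged; anything
    else is assumed to be in the fallback encoding and is re-encoded.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        # Assumption: the whole note is in the fallback encoding.
        return raw.decode(fallback).encode("utf-8")


print(to_utf8(b"caf\xe9"))
```

A note that mixes both encodings would be misread by such a check, and
ISO-8859-1 decoding never fails, so the result is only as good as the
guess about the original encoding.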

------------------------------------------------------------------------
[2014-01-31 17:05:20] bj...@php.net

Your second problem sounds like: https://github.com/php/web-php/pull/32

As for the 1st one; I have deleted the note that had the invalid character 
point.

Can you recheck in about 60 minutes and see if it works?

------------------------------------------------------------------------
[2014-01-31 16:48:05] francois dot gannaz at silecs dot info

After some debugging, there are 2 distinct problems:

1. Some comments on PHP pages are badly encoded. They contain invalid 
characters, which prevent the W3C validator (or iconv) from parsing the page. 
There are such problems among the 420 KB of 
"http://php.net/manual/en/function.array-merge.php".

2. The problem with "zero-width spaces" that get printed is probably related to 
CSS, not to encoding bugs. At least with Opera 15, disabling the following line 
fixes the display:
body, input, textarea { 
    font-family: "Source Sans Pro", "Helvetica", "Arial", sans-serif;
}
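
For the first problem, the offending bytes can also be located without
the validator. Below is a rough Python sketch; the function name
invalid_utf8_offsets is made up for illustration, and the input is
assumed to be the raw bytes of the page as served:

```python
def invalid_utf8_offsets(data: bytes) -> list:
    """Return the byte offsets at which UTF-8 decoding of data fails."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid UTF-8
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are relative to the slice data[pos:].
            offsets.append(pos + exc.start)
            pos += exc.end  # resume after the invalid bytes
    return offsets


print(invalid_utf8_offsets(b"abc\xe9def\xa0"))
```

Each reported offset can then be matched against the page source to
find the user note that contains it.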

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=66618


-- 
PHP Webmaster List Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php