ID:               41554
 Updated by:       [EMAIL PROTECTED]
 Reported By:      victorepand at gmail dot com
-Status:           Open
+Status:           Feedback
-Bug Type:         *Languages/Translation
+Bug Type:         Strings related
 Operating System: Linux
 PHP Version:      4.4.7
 New Comment:

Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves. 

A proper reproducing script starts with <?php and ends with ?>,
is at most 10-20 lines long and does not require any external 
resources such as databases, etc. If the script requires a 
database to demonstrate the issue, please make sure it creates 
all necessary tables, stored procedures etc.

Please avoid embedding huge scripts into the report.




Previous Comments:
------------------------------------------------------------------------

[2007-06-01 01:32:55] [EMAIL PROTECTED]

My gut reaction to your problem is to mention that you've probably
mixed up ISO 8859-1 and Windows-1252: the two are commonly confused for
each other, the Windows encoding containing several more characters:
€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ

However, said behavior does not precisely match up with your
predicament, as © and ® are part of ISO 8859-1. Furthermore, the URL you
supplied is already encoded in UTF-8. Perhaps you are double encoding?

Either way, this is not a problem with the documentation, except
possibly that the user notes on utf8_encode are way too long and some
of the info needs to be integrated into the main docs.
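
To illustrate the ISO 8859-1 versus Windows-1252 difference described
above, here is a minimal sketch. It assumes the iconv extension is
available and uses iconv('ISO-8859-1', 'UTF-8', ...) as a stand-in for
utf8_encode, which applies the same byte-to-code-point mapping. The
sample bytes are hypothetical and are not taken from the reporter's
page.

```php
<?php
// Byte 0x92 is a right single quote (U+2019) in Windows-1252,
// but an invisible C1 control character (U+0092) in ISO 8859-1.
$quote = "\x92";
// Byte 0xA9 is the copyright sign (U+00A9) in both encodings.
$copy = "\xA9";

// Input read as ISO 8859-1 (the mapping utf8_encode applies).
$quoteLatin1 = iconv('ISO-8859-1', 'UTF-8', $quote);
$copyLatin1  = iconv('ISO-8859-1', 'UTF-8', $copy);

// Input read as Windows-1252.
$quoteCp1252 = iconv('Windows-1252', 'UTF-8', $quote);
$copyCp1252  = iconv('Windows-1252', 'UTF-8', $copy);

echo bin2hex($quoteLatin1), "\n"; // c292   (control character)
echo bin2hex($quoteCp1252), "\n"; // e28099 (right single quote)
echo bin2hex($copyLatin1), "\n";  // c2a9
echo bin2hex($copyCp1252), "\n";  // c2a9
```

If the source page really is Windows-1252, converting from that
encoding rather than from ISO 8859-1 would cover the extra characters
listed above without a manual replacement table.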

------------------------------------------------------------------------

[2007-06-01 00:57:31] victorepand at gmail dot com

Description:
------------
I have used the function utf8_encode to encode iso-8859-1 pages into
UTF-8 and displayed them on my site, but strange and funny characters
are appearing such as "" and "Â".

It turns out that the iso-8859-1 page contains characters such as
these:
©,’,—,“,”,®,™,…
These characters display fine on my browser from the iso-8859-1 page,
but when I use the utf8_encode function and display it on my utf-8 page,
the result is garbled.

So I have found the only solution is to manually convert all of the
characters above before using the utf8_encode function and that solves
the problem crudely, but it is not a perfect solution. What if I have
missed any characters? Isn't there a cleaner method, a PHP function,
that will do all this conversion without worry and without missing any
characters?



Reproduce code:
---------------
Here is an example of an iso-8859-1 page which displays fine on my
browser, but contains characters such as ©,’,—,“,”,®,™,… mentioned
above:
http://www.jardenstore.com/product.aspx?bid=18&pid=1251


Expected result:
----------------
After using the utf8_encode function, I expected to see the page
displaying correctly again on my UTF-8 page with these characters
intact: ©,’,—,“,”,®,™,… 

Actual result:
--------------
Instead, the result was garbled like this:
‘,—,–,’,Â,â€â„¢,â€â„¢,â€,é,ð,™,œ,,è,Ž,Â
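
Sequences such as "â€" and "â„¢" resemble what double encoding
produces. The following is a rough sketch of that effect, assuming the
input was already UTF-8 before being encoded again. It uses iconv as a
stand-in for utf8_encode and a hypothetical one-character sample, so it
is not a reproduction of the page above.

```php
<?php
// A right single quote (U+2019) that is already valid UTF-8.
$utf8 = "\xE2\x80\x99";

// Encoding it again as if each byte were an ISO 8859-1 character
// (the mapping utf8_encode applies) turns 3 bytes into 6.
$double = iconv('ISO-8859-1', 'UTF-8', $utf8);
echo bin2hex($double), "\n"; // c3a2c280c299

// A viewer that reads the original three bytes as Windows-1252
// sees three separate characters: U+00E2, U+20AC, U+2122.
$seen = iconv('Windows-1252', 'UTF-8', $utf8);
echo bin2hex($seen), "\n"; // c3a2e282ace284a2
```

If this is the cause, skipping the conversion for input that is already
UTF-8 would avoid the garbling.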


------------------------------------------------------------------------

