From:             [EMAIL PROTECTED]
Operating system: WinXP
PHP version:      4.3.4
PHP Bug Type:     *XML functions
Bug description:  Possible bug in utf8_encode (bit operations)

Description:
------------
Hi!

I'm currently developing a script that generates OpenOffice SXW files by filling content.xml (which is UTF-8 encoded) with database content. While doing this I found that utf8_encode() does not return what I expect for the character with code 147: when I check the whole result in OpenOffice, that character is displayed as a square (unknown character?).

So I ran some tests with UTF-8 conversion (including the mb_* functions) and found that the characters between 128 and 160 returned by utf8_encode() don't seem to match the standard. As mentioned above, the output for character 147 is not what it should be, namely '’' (which is what you get when using UltraEdit for the conversion).

Can anyone give me an explanation here? I'm not familiar with UTF-8 and bit-level conversion, but I don't think PHP does what it's supposed to do here.

As a first workaround I wrote a custom_utf8_encode() that uses its own character map to override this behaviour (see below).

Can someone help me out with this strange bug?

Regards
Bjoern Kraus

function custom_utf8_encode($str)
{
    // Replacement characters for byte values 128-159.
    // Entries set to '' are dropped from the output.
    $chrMap = array(
        128 => '€', 129 => '',  130 => '‚', 131 => 'ƒ',
        132 => '„', 133 => '…', 134 => '†', 135 => '‡',
        136 => 'ˆ', 137 => '‰', 138 => 'Š', 139 => '‹',
        140 => 'Œ', 141 => '',  142 => 'Ž', 143 => '',
        144 => '',  145 => '‘', 146 => '’', 147 => '“',
        148 => '”', 149 => '•', 150 => '–', 151 => '—',
        152 => '˜', 153 => '™', 154 => 'š', 155 => '›',
        156 => 'œ', 157 => '',  158 => 'ž', 159 => 'Ÿ'
    );

    $newStr = '';
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $chrVal = ord($str[$i]);
        if ($chrVal > 127 && $chrVal < 160) {
            // Bytes 128-159: take the replacement from the map.
            $newStr .= $chrMap[$chrVal];
        } else {
            // Everything else: let utf8_encode() handle it.
            $newStr .= utf8_encode($str[$i]);
        }
    }
    return $newStr;
}

-- 
Edit bug report at http://bugs.php.net/?id=28654&edit=1
-- 
Try a CVS snapshot (php4):   http://bugs.php.net/fix.php?id=28654&r=trysnapshot4
Try a CVS snapshot (php5):   http://bugs.php.net/fix.php?id=28654&r=trysnapshot5
Fixed in CVS:                http://bugs.php.net/fix.php?id=28654&r=fixedcvs
Fixed in release:            http://bugs.php.net/fix.php?id=28654&r=alreadyfixed
Need backtrace:              http://bugs.php.net/fix.php?id=28654&r=needtrace
Need Reproduce Script:       http://bugs.php.net/fix.php?id=28654&r=needscript
Try newer version:           http://bugs.php.net/fix.php?id=28654&r=oldversion
Not developer issue:         http://bugs.php.net/fix.php?id=28654&r=support
Expected behavior:           http://bugs.php.net/fix.php?id=28654&r=notwrong
Not enough info:             http://bugs.php.net/fix.php?id=28654&r=notenoughinfo
Submitted twice:             http://bugs.php.net/fix.php?id=28654&r=submittedtwice
register_globals:            http://bugs.php.net/fix.php?id=28654&r=globals
PHP 3 support discontinued:  http://bugs.php.net/fix.php?id=28654&r=php3
Daylight Savings:            http://bugs.php.net/fix.php?id=28654&r=dst
IIS Stability:               http://bugs.php.net/fix.php?id=28654&r=isapi
Install GNU Sed:             http://bugs.php.net/fix.php?id=28654&r=gnused
Floating point limitations:  http://bugs.php.net/fix.php?id=28654&r=float
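
A possible explanation for the report above, offered as a sketch rather than a diagnosis: utf8_encode() is documented as converting ISO-8859-1 to UTF-8, and in ISO-8859-1 the bytes 128-159 are control characters. The character map in the report matches Windows-1252 instead, where byte 147 is the left double quotation mark. The snippet below compares the two interpretations of that byte. It assumes the mbstring extension is loaded and knows the 'Windows-1252' encoding name; the variable names are illustrative only.

```php
<?php
// Byte 0x93 (decimal 147): a control character in ISO-8859-1,
// the left double quotation mark (U+201C) in Windows-1252.
$byte = "\x93";

// Same conversion that utf8_encode() is documented to perform
// (ISO-8859-1 to UTF-8), written with mbstring.
$asLatin1 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');

// Treating the same byte as Windows-1252 instead (assumed source encoding).
$asCp1252 = mb_convert_encoding($byte, 'UTF-8', 'Windows-1252');

echo bin2hex($asLatin1), "\n"; // c293   (U+0093, a control character)
echo bin2hex($asCp1252), "\n"; // e2809c (U+201C, left double quotation mark)
```

If the database content really is Windows-1252, converting the whole string with the second call might replace the hand-written map, though bytes that Windows-1252 leaves undefined (129, 141, 143, 144, 157) would still need a decision.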