I'm working on an application where I need to replace some Unicode characters using a PHP shell script.
The problem I'm having is matching multibyte characters. One character in particular is U+2014, an em dash '—'.

To get the decimal codes that identify the character I've tried two methods. The first was a little script like this:

    <?php
    // print each byte of the character alongside its decimal value
    $x=0;
    while($letter = substr('—',$x,1)){
        echo "$letter\t" . ord($letter) ."\n";
        $x++;
    }
    ?>

I took the multiple decimal codes it returned and tried a replacement like:

    echo str_replace(chr(226).chr(128).chr(148),"-hyphen",$data);

But that didn't match, so I figured my method of finding the decimal values wasn't correct.

Next I thought I could UTF-8 encode the character, take its ord(), and compare that decimal against a UTF-8-encoded version of my string. But when I took the ord() of each of the characters I wanted to watch for, it always gave me the same decimal, 195. So that idea wasn't going to work either.

I'm out of ideas; any input would be appreciated.

Thanks.

--
Jeff Bearer, RHCE
Webmaster, PittsburghLIVE.com

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
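
For reference, a minimal sketch of the byte-level comparison described above. The bytes 226 128 148 are the UTF-8 encoding of U+2014, so the replacement itself should work whenever the input really is UTF-8. One possible cause of the mismatch (an assumption, not something the post confirms) is that $data arrives in another encoding, such as Windows-1252, where the em dash is the single byte 0x97. Note also that ord() only reports the first byte of whatever string it is given, so characters whose encoded forms start with the same byte will all yield the same number. The helper byte_values() and the sample strings below are hypothetical, introduced only for illustration.

```php
<?php
// Spell the em dash (U+2014) as explicit UTF-8 bytes so the result does not
// depend on the encoding the script file itself was saved in.
$emDash = "\xE2\x80\x94"; // same bytes as chr(226).chr(128).chr(148)

// Hypothetical helper: list the decimal value of every byte in a string.
function byte_values(string $s): array
{
    $out = [];
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $out[] = ord($s[$i]);
    }
    return $out;
}

// Hypothetical sample input, assumed to be UTF-8 encoded.
$data = "before" . $emDash . "after";

echo implode(' ', byte_values($emDash)), "\n";     // 226 128 148
echo str_replace($emDash, "-hyphen", $data), "\n"; // before-hyphenafter

// Assumed alternative: the same text in Windows-1252, where the em dash is
// the single byte 0x97. The UTF-8 search string cannot match it, so the
// input comes back unchanged.
$cp1252 = "before\x97after";
var_dump(str_replace($emDash, "-hyphen", $cp1252) === $cp1252); // bool(true)

// Dumping the raw bytes of the real input shows which case applies.
echo bin2hex($cp1252), "\n";
```

Running bin2hex() on a short slice of the real $data around the troublesome character is a quick way to see which byte sequence needs to be passed to str_replace().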