I'm working on an application where I need to replace some Unicode
characters using a PHP shell script.
The problem I'm having is matching multibyte characters.
One character in particular is U+2014, an em dash: '—'.
To find the decimal codes that identify the character, I've tried two
methods. The first was a little script like this:
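<?php
// Print every byte of the string with its decimal value.
// The file must be saved as UTF-8 for the literal to hold three bytes.
$s = '—';
for ($x = 0; $x < strlen($s); $x++) {
    $letter = $s[$x];
    echo "$letter\t" . ord($letter) . "\n";
}
?>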
I took the decimal codes it returned and tried a replacement like:
echo str_replace(chr(226).chr(128).chr(148),"-hyphen",$data);
But that didn't match, so I figured my method of finding the decimal
values wasn't correct.
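One way to check this, sketched under the assumption that $data might not
be UTF-8 at all: dump its bytes in hex and compare them with e2 80 94, the
UTF-8 encoding of U+2014. The sample string and the helper name hex_bytes()
are made up for illustration. In Windows-1252, for instance, the em dash is
the single byte 0x97, which would explain a failed match.

```php
<?php
// Hypothetical sample: "a", then byte 0x97 (the em dash in Windows-1252), then "b".
$data = "a\x97b";

// Dump every byte in hex so it can be compared with e2 80 94.
function hex_bytes(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

$dump = hex_bytes($data);
echo $dump, "\n";

// The UTF-8 sequence is not present here, so replacing it changes nothing.
$utf8Replaced = str_replace("\xE2\x80\x94", '-hyphen', $data);

// Replacing the single Windows-1252 byte does match.
$cp1252Replaced = str_replace("\x97", '-hyphen', $data);
echo $cp1252Replaced, "\n";
```

If the dump of the real data does show e2 80 94, the three-byte
replacement should match and the problem lies somewhere else.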
Next I thought I could UTF-8 encode the character, take its ord(), and
compare that decimal against a UTF-8-encoded version of my string.
But when I took the ord() of each character I wanted to watch for, it
always gave me the same decimal, 195, so that idea wasn't going to work
either.
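A possible explanation for the constant 195, assuming the encoding step was
utf8_encode() and the characters were already UTF-8: that function treats
every input byte as an ISO-8859-1 character, so a lead byte such as 0xE2
becomes the two bytes 0xC3 0xA2, and ord() only ever looks at the first
byte. The helper latin1_to_utf8() below is a hypothetical stand-in that
mimics that byte-by-byte conversion.

```php
<?php
// Hypothetical helper: re-encode a string byte by byte as if each byte were
// an ISO-8859-1 character, which is how utf8_encode() treats its input.
function latin1_to_utf8(string $s): string
{
    $out = '';
    for ($i = 0; $i < strlen($s); $i++) {
        $b = ord($s[$i]);
        if ($b < 0x80) {
            $out .= chr($b);                    // ASCII stays a single byte
        } else {
            $out .= chr(0xC0 | ($b >> 6))       // lead byte: 0xC2 or 0xC3
                  . chr(0x80 | ($b & 0x3F));    // continuation byte
        }
    }
    return $out;
}

// Three characters already in UTF-8: em dash, left double quote, euro sign.
$chars = ["\xE2\x80\x94", "\xE2\x80\x9C", "\xE2\x82\xAC"];

$firstBytes = [];
foreach ($chars as $c) {
    $firstBytes[] = ord(latin1_to_utf8($c));    // ord() reads only the first byte
}
echo implode(' ', $firstBytes), "\n";
```

Every three-byte UTF-8 character starts with a byte in the range 0xE0 to
0xEF, and all of those map to a 0xC3 lead byte, which is 195 in decimal.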
I'm out of ideas; any input would be appreciated. Thanks.
--
Jeff Bearer, RHCE
Webmaster, PittsburghLIVE.com
--
PHP General Mailing List (http://www.php.net/)