2010/11/23 రహ్మానుద్దీన్ షేక్ <[email protected]>
> I have this code in PHP to read the length of a string in UTF-8:
>
> <?php
> header("Content-Type: text/html; charset=utf-8");
> $a = "தமிள்";
> $c = strlen(utf8_decode($a));
> echo "The string input: ".$a." and its length:".$c;
> ?>
>
> But it returns this output: The string input: தமிள் and its length:5
> Now I want the output to be 3, not 5. How can I achieve that?
>
>
This has to do with how Unicode encodes Tamil, not with PHP. It's the same in
every Indic language.
Tamil is stored in Unicode strings in its phonetic / grammatical form.
So, தமிள் is actually stored as tha + (ma + vowel sign i) + (La + dot). That
makes up the 5 characters (Unicode code points) that your code counts.
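The only way to achieve what you want is to write (or use) a strlen function
which understands Tamil grammar, that is, one that counts a consonant together
with its attached vowel sign or dot as a single letter.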
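
A minimal sketch of such a function follows. It assumes that counting Unicode
grapheme clusters is close enough to "Tamil letters" for your purpose. The
helper name count_graphemes() is made up for this example. It uses
grapheme_strlen() when the intl extension is loaded and otherwise falls back
to the \X escape in PCRE, which matches one grapheme cluster.

```php
<?php
// Count user-perceived letters (grapheme clusters) instead of code points.
// count_graphemes() is a hypothetical helper name, not a PHP built-in.
function count_graphemes(string $s): int
{
    // Prefer the intl extension's grapheme_strlen() when it is available.
    if (function_exists('grapheme_strlen')) {
        $n = grapheme_strlen($s);
        if (is_int($n)) {
            return $n;
        }
    }
    // Fallback: \X matches one grapheme cluster; /u treats the input as UTF-8.
    $n = preg_match_all('/\X/u', $s, $matches);
    return $n === false ? 0 : $n;
}

$a = "தமிள்";
echo "The string input: " . $a . " and its length: " . count_graphemes($a) . "\n";
```

For தமிள் this should report 3, because the vowel sign and the dot are grouped
with the consonant they attach to. Other scripts or more complex conjuncts may
need extra rules, so treat this as a starting point.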
Regards,
Arun Venkataswamy.
_______________________________________________
ILUGC Mailing List:
http://www.ae.iitm.ac.in/mailman/listinfo/ilugc