If you want to sort Persian Unicode strings correctly on the server
side, this is for you. I just tested this on my Red Hat 7.1 machine, fully
updated with up2date, and it works well. This is Linux-only, and the
great thing, again, is that you won't need to install anything if you are
running the latest software.
1. Make sure you have PHP >= 4.0.5 and glibc >= 2.2.3.
2. Make sure you have the 'fa_IR' locale installed on your machine. It is
included with glibc. On my machine, it resides at:
/usr/share/i18n/locales/fa_IR
If you are using a distribution other than Red Hat, the file may be
somewhere else, but it's always named 'fa_IR'.
3. Copy the attached samples to a directory where you can run PHP, and
open the PHP script 'sort.php' from a web browser (like Mozilla 1.1beta
or IE 5.0 or later). It reads the file 'sort.txt' and sorts its entries
according to the Persian ordering of Arabic letters.
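Before running the sample, the prerequisites from steps 1 and 2 can be checked from PHP itself. The following is only a minimal sketch, not part of the attached samples: the helper name check_persian_collation() is made up for illustration, the extra locale names 'fa_IR.UTF-8' and 'fa_IR.utf8' are an assumption for systems that list the locale under an explicit charset, and the code is written for current PHP releases (version_compare() and multi-name setlocale() are not available in PHP 4.0.5 itself). The glibc version is not visible from PHP, so it is not checked.

```php
<?php
// Hypothetical helper (not in the attached samples): reports whether the
// prerequisites from steps 1 and 2 appear to be met on this machine.
function check_persian_collation()
{
    // Step 1: PHP version. The glibc version cannot be queried from PHP.
    $php_ok = version_compare(PHP_VERSION, '4.0.5', '>=');

    // Remember the current collation locale; "0" only queries the setting.
    $previous = setlocale(LC_COLLATE, '0');

    // Step 2: try the Persian locale. setlocale() returns the name it
    // selected, or false if none of the candidates is installed.
    // The two names with a charset suffix are assumptions.
    $locale = setlocale(LC_COLLATE, 'fa_IR.UTF-8', 'fa_IR.utf8', 'fa_IR');

    // Restore the previous setting so the check has no side effects.
    if ($previous !== false) {
        setlocale(LC_COLLATE, $previous);
    }

    return array('php_ok' => $php_ok, 'locale' => $locale);
}

$report = check_persian_collation();
echo 'PHP >= 4.0.5: ', ($report['php_ok'] ? 'yes' : 'no'), "\n";
echo 'fa_IR locale: ',
     ($report['locale'] === false ? 'not available' : $report['locale']), "\n";
```

If the second line reports "not available", sort.php will still run, but the result will not follow Persian ordering.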
Notes:
* You can see a sample output at:
http://www.bamdad.org/~roozbeh/sort.php
(Make sure you have the latest version of Microsoft fonts if you are
using Microsoft fonts to view Persian pages, otherwise you may see
broken 'Yeh's.)
* Your output may differ slightly from the one at the above URL, since
the current glibc has a bug in handling Alef-Hamza vs Waw-Hamza. This
is due to a typo I made in the locale file I submitted to glibc in
March 2001. I'm cleaning up the locale file and will send the patch to
the glibc maintainers, to be included in the next version.
* Behdad and I are working on a mini-Howto for sorting Persian data in
many environments including C, PHP, and PostgreSQL. Keep your fingers
crossed!
* The ordering is based on usage in references like Amid and Mo'in
Persian dictionaries. The test data are by Omid Milani, with some
additions by me.
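To see why the locale is needed at all, here is a small sketch that compares a plain byte-wise sort with the strcoll()-based sort on two entries from sort.txt. It is an illustration only, not one of the attached samples. In the Persian alphabet Peh comes before Hah, but Peh (U+067E) has a higher code point than Hah (U+062D), so a byte-wise sort of UTF-8 data puts the two words in the wrong order. The locale-aware result depends on 'fa_IR' being installed, so the sketch reports when it is not.

```php
<?php
// Two entries from sort.txt. In Persian ordering the first one (starting
// with Peh) belongs before the second one (starting with Hah).
$words = array('پشتی', 'حامل');

// Plain byte-wise sort: UTF-8 byte order follows code point order, so the
// word starting with Hah (U+062D) ends up before the one with Peh (U+067E).
$bytewise = $words;
sort($bytewise, SORT_STRING);

// Locale-aware sort, as in sort.php. If setlocale() returns false, the
// locale is missing and strcoll() will not apply Persian rules.
$locale = setlocale(LC_COLLATE, 'fa_IR');
$collated = $words;
usort($collated, 'strcoll');

echo 'byte order:   ', implode(', ', $bytewise), "\n";
echo 'locale order: ', implode(', ', $collated),
     ($locale === false ? ' (fa_IR not available, not Persian-aware)' : ''), "\n";
```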
Good luck,
roozbeh
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fa" lang="fa">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body dir="rtl"
style="font-family: Microsoft Sans Serif, Courier New, Tahoma, Times">
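<?php
// Set the collation locale to Persian. If setlocale() returns false, the
// locale is not installed and the sort below will not be Persian-aware.
setlocale(LC_COLLATE, 'fa_IR');
$words = file('sort.txt');                        // read the input file, one entry per line
srand((int) (((float) microtime()) * 1000000));   // seed the generator (automatic since PHP 4.2.0)
shuffle($words);                                  // shuffle the list randomly
usort($words, 'strcoll');                         // sort using the locale's collation rules
foreach ($words as $value) {                      // display the output
    echo rtrim($value), "<br />\n";
}
?>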
</body>
</html>
برنامهیِ ما
وسائل
گاو
مأمن
برنامهیی
برنامهنویس
موسوی
وسایل
مومیایی
کتاب
وحدنا
موسی
حامل
مسئول
مسایل
عیسی
مسهل
موسا
برنامهئی
هرمز
مسائل
پشتی
مرتبه
مسئله
حائل
قرمز
وحده لا شریک له
مرتبت
مسبوق
عیسوی
مسؤول
برنامهای
موم
مسابقه
برنامهها
عیسا
مؤمن
یاران
مسأله
وحدة ملت الاسلامی
وساطت
حایل
برنامه ما
مرتبة
مسوول
وحدت
ماموت