On Sun, 18 Jan 2004, Ebadat A.R. wrote:

> Hi,
> You can search for ge, pe, jh, and che; if you find them, the text is
> Persian.
> If you want an accurate program, you need a Persian word list. It is
> easy to make one for yourself: just write a program that collects
> Hamshahri newspaper content and extracts its words. You can use these
> words in your programs.
> For every text, check each word: if it is in the Persian word list, add a
> number to the text's Persian rank; if it is not, subtract a number from
> the rank.
> Finally you have a number (which you should normalize first) that helps
> you decide whether your text is Persian or not.
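
The two checks described in the quoted message could be sketched roughly as follows. This is only an illustration, not the poster's actual program: the names `has_persian_letters` and `persian_rank`, the whitespace tokenization, the `hit`/`miss` weights, and the normalization to the range [-1, 1] are all assumptions, and the word set is expected to come from a corpus such as the Hamshahri text mentioned above.

```python
# Letters used in Persian but not in Arabic: pe, che, zhe ("jh"), gaf ("ge").
PERSIAN_ONLY = {"\u067E", "\u0686", "\u0698", "\u06AF"}


def has_persian_letters(text):
    """Quick check: does the text contain any Persian-specific letter?"""
    return any(ch in PERSIAN_ONLY for ch in text)


def persian_rank(text, persian_words, hit=1.0, miss=1.0):
    """Score a text against a Persian word list.

    Adds `hit` for every word found in `persian_words` and subtracts
    `miss` for every word that is not, then normalizes by the largest
    possible magnitude so the result lies in [-1, 1].
    Tokenization is plain whitespace splitting (an assumption; real
    text would also need punctuation handling).
    """
    words = text.split()
    if not words:
        return 0.0
    score = 0.0
    for word in words:
        score += hit if word in persian_words else -miss
    return score / (len(words) * max(hit, miss))
```

A rank near 1 would suggest Persian and a rank near -1 would suggest another language; where to put the cut-off is left open here.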
Matching Persian words is really hard (Persian Yeh vs Arabic Yeh, Persian Kaf vs Arabic Kaf, ...)

> Regards,
> Ebadat A.R.
> http://www.WebRrah.com/machine-translation (Machine Translation weblog)
_______________________________________________
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing
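
On the Yeh/Kaf point raised in the reply: one possible way to soften the problem is to map the Arabic code points to their Persian counterparts before looking words up. The sketch below is an assumption, not something proposed in the thread; the name `normalize_persian` is hypothetical, and it covers only the two letter pairs mentioned, so other variations would still need handling.

```python
# Map Arabic code points to the ones normally used in Persian text.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def normalize_persian(word):
    """Return the word with Arabic Yeh/Kaf replaced by Persian Yeh/Kaf."""
    return word.translate(ARABIC_TO_PERSIAN)
```

Applying the same normalization to both the word list and the text being checked would let a word typed with either form of the letter match.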