It means that you have to put the short vowels in. Not all Arabic text has its short vowels; the Qur'an, I know, does. If you go down this route you might as well ask: why not force the user to disambiguate every word?
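To make the ambiguity concrete, here is a minimal Python sketch. It assumes the two words from the original question are written حُب (love, with damma) and حَب (grain, with fatha); the helper name `strip_short_vowels` is made up for the illustration. Once the diacritics are dropped, as they are in most everyday text, the two words become the same string.

```python
import unicodedata

def strip_short_vowels(text: str) -> str:
    """Drop Arabic diacritics (harakat); they are Unicode combining marks."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))

# Assumed spellings of the two words from the question.
love = "\u062D\u064F\u0628"   # ha + damma + ba  ("love")
grain = "\u062D\u064E\u0628"  # ha + fatha + ba  ("grain")

print(love == grain)                                          # False
print(strip_short_vowels(love) == strip_short_vowels(grain))  # True
```

So a translator that only ever sees the stripped form has to get the meaning from somewhere else, which is the point of what follows.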
A far better approach is to use LSA (latent semantic analysis). The irony is this. Suppose I search for an article on agriculture, let's say http://www.vancouversun.com/news/Cracking+wheat+genome+could+improve+food+security+scientists/3453208/story.html Google would index this using LSI. Or take http://manybooks.net/titles/uchardm2186821868-8.html There again Google search sorts it and puts it at the top of the list. The great irony is that Google does not use the concepts it has developed for search technology in translation. LSI is, by the way, the same thing as LSA.

Every website that Google translates has a vector associated with it, which it uses for search. Each of the examples I have given has a vector associated with it, so حب would mean different things in the two articles. حب has a number of vector matches. If you meet (for example) جميلة you would associate it with love and sex.

Google is firmly committed to what it terms statistical translation. That is to say, it takes bilingual text and matches words. Statistical translation takes care (to some extent) of local word context, but not of global context. One additional point: statistical translation has no concept of grammar. A lot of disambiguation can be done purely by looking at grammar, agreement, etc.

One last point: what to do. I think it would be futile to try to get Google to change its strategy. Far better would be to try to get Arabic included in MOLTO, a language translation project which the EU has just started. This is based on a common grammar and (I would presume. Google, by the way, translates everything into English first. If you wanted to translate جذلان into French you might get "homosexual": "gay", the literal translation of جذلان, also means "homosexual" in English. Quite clearly this is unacceptable in the EU. We English have been castigated for not speaking other languages. Our commissioner Lady Ashton, who is Foreign Minister of the EU, has been forced to go on a retreat to improve her French.
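As a rough illustration of using global context to pick a meaning, here is a small Python sketch. It is not real LSA (which would apply a singular value decomposition to a term-document matrix), and it is certainly not a description of what Google does; it is a plain bag-of-words cosine comparison. The sense profiles in `SENSES` and the function names are invented for the example.

```python
import math
from collections import Counter

# Hypothetical profiles: words assumed to co-occur with each sense of
# the undiacritized word (love vs. grain).
SENSES = {
    "love": Counter(["heart", "beautiful", "marriage", "poem", "romance"]),
    "grain": Counter(["wheat", "harvest", "farmer", "genome", "food"]),
}

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def pick_sense(document_words):
    """Return the sense whose profile is closest to the whole document."""
    doc = Counter(document_words)
    # With no overlap at all, max() simply returns the first sense listed.
    return max(SENSES, key=lambda sense: cosine(doc, SENSES[sense]))

article = "scientists crack wheat genome to improve food security".split()
print(pick_sense(article))
```

The decision uses the vocabulary of the whole article rather than the two or three neighbouring words, which is the difference between global and local context described above.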
The idea of translating everything into English is totally abhorrent. A Von Neumann (context free) language seems to me to be the only option MOLTO has.

What Arab country are you from? If it is a MINA (Mediterranean) country, might I suggest that you lobby your own government, MOLTO, MINA and the EU to include Arabic as a MOLTO language. All EU languages are included, plus Russian. Arabic is, if anything, more EU than Russian in view of the MINA connection. Pressure from governments would, I am sure, get Arabic included. Hofstadter has said that understanding of natural language will produce AGI (Artificial General Intelligence). NOT going down the "English" route, as Google has, would be a step towards AGI. If MOLTO works, therefore, it will be a milestone on the AGI route.

I have myself written a program which could be used to do just what you are suggesting. It takes Arabic text (no short vowels) and finds, based on Buckwalter's dictionary, all the possible translations into English. It also splits an Arabic word into stems and inflexions. A Java program asks you to choose a meaning. If, instead of English, you chose Arabic words with accents for the short vowels plus a Dewey-type code for the meaning (displayed as Arabic sentences when the mouse was over the word), it would go a long way towards doing what you suggest. Of course, a Near Human Quality translator would do this for you.
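The kind of lookup described above can be sketched in a few lines of Python. This is not my Java program and not Buckwalter's actual data: the three tiny tables and the function name `analyses` are made up for the example, and the prefix/stem/suffix compatibility checks that a real analyser needs are left out.

```python
# Hypothetical miniature lexicon (undiacritized forms).
PREFIXES = {"": "", "ال": "the", "و": "and", "وال": "and the"}
STEMS = {"حب": ["love", "grain"], "كتاب": ["book"]}
SUFFIXES = {"": "", "ي": "my", "ه": "his"}

def analyses(word):
    """List every (prefix, stem, suffix, gloss) split found in the tables."""
    results = []
    for i in range(len(word) + 1):             # end of prefix
        for j in range(i + 1, len(word) + 1):  # end of stem (stem is non-empty)
            prefix, stem, suffix = word[:i], word[i:j], word[j:]
            if prefix in PREFIXES and stem in STEMS and suffix in SUFFIXES:
                for gloss in STEMS[stem]:
                    results.append((prefix, stem, suffix, gloss))
    return results

# Every candidate meaning is listed; the user (or a context model) then chooses.
for prefix, stem, suffix, gloss in analyses("والحب"):
    print(PREFIXES[prefix], gloss, SUFFIXES[suffix])
```

Each candidate could then be shown with its short vowels and a meaning code, and the user asked to pick one.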
- Ian Parker

In general, disambiguation is done both locally and by word association.

On Aug 27, 10:48 pm, ahmed wrote:
> In the Arabic language, diacritical marks play an important role in
> many cases, especially in cases of vowels, ablaut and slurring. Because
> of that, a lot of errors occur in translation, especially when words
> have the same letters but different meanings, like the words "حُب",
> which means love, and "حَب", which means a type of grain. Unfortunately,
> in the existing translation process they are both translated as "love".
> So I suggest a new way to deal with Arabic text with diacritical marks,
> so that it can be used when translating diacritized Arabic words or text.
>
> Please tell me how and where to send this suggestion. With thanks.
