It means that you have to put the short vowels in. Not all Arabic text
has its short vowels written. The Qur'an, I know, has. If you go down
this route you might as well ask why not force the user to disambiguate
every word.

A far better approach is the use of LSA (Latent Semantic Analysis). The
irony is this. Suppose I were to search for an article on agriculture,
let's say
http://www.vancouversun.com/news/Cracking+wheat+genome+could+improve+food+security+scientists/3453208/story.html

Google would index this using LSI. 
http://manybooks.net/titles/uchardm2186821868-8.html
There again Google search sorts it and puts it at the top of the list.

The great irony is that Google does not use the concepts that it has
developed for search technology in translation. LSI is, by the way, the
same thing as LSA. Every website that Google translates has a vector
associated with it, which it uses for search. Each of the examples I
have given has a vector associated with it. حب would therefore mean
different things in the two articles. حب has a number of vector
matches. If you meet (for example) جميلة you would associate it with
love and sex.
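
To make the idea concrete, here is a minimal sketch in Python. It is not
LSA proper (there is no SVD step) and it is certainly not how Google does
it. It is plain cosine similarity between word-count vectors, used as a
simplified stand-in. The sense profiles and the names SENSE_PROFILES,
cosine and pick_sense are made up for the illustration.

```python
import math
from collections import Counter

# Hypothetical sense profiles for the unvowelled word حب: a few English
# context words one might expect near each sense. Invented for this sketch.
SENSE_PROFILES = {
    "love": Counter(["heart", "beloved", "beautiful", "romance", "kiss"]),
    "grain": Counter(["wheat", "harvest", "farm", "crop", "food"]),
}


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_sense(document_words, profiles=SENSE_PROFILES):
    """Return the sense whose profile is closest to the document vector.

    If no profile overlaps the document at all, the first sense in the
    dictionary is returned, so a real system would need a fallback.
    """
    doc_vector = Counter(w.lower() for w in document_words)
    return max(profiles, key=lambda sense: cosine(doc_vector, profiles[sense]))
```

Called on the words of the agriculture headline above, pick_sense would
prefer "grain", because "wheat" and "food" overlap that profile and
nothing overlaps the other one.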

Google is firmly committed to what it terms statistical translation.
That is to say, it takes bilingual text and matches words. Statistical
translation takes care (to some extent) of local words, but not global
context.

One additional point. Statistical translation has no concept of
grammar. A lot of disambiguation can be done purely by looking at
grammar, agreement etc.

One last point: what to do. I think that it would be futile to try to
get Google to change its strategy. Far better would be to try to get
Arabic included in MOLTO. The EU has just started a language
translation project. This is based on a common grammar and (I would
presume. Google by the way translates everything into English first.
If you wanted to translate جذلان into French you might get
"homosexual". "Gay", the literal translation of جذلان, also means
"homosexual" in English.

Quite clearly this is unacceptable in the EU. We English have been
castigated for not speaking other languages. Our commissioner Lady
Ashton, who is Foreign Minister of the EU, has been forced to go on a
retreat to improve her French. The idea of translating everything into
English is totally abhorrent. A Von Neumann (context free) language
seems to me to be the only option MOLTO has.

What Arab country are you from? If it is a MINA (Mediterranean)
country, might I suggest that you lobby your own government + MOLTO,
MINA and the EU to include Arabic as a MOLTO language. All EU
languages are included + Russian. Arabic is, if anything, more EU than
Russian in view of the MINA connection. Pressure from governments
would, I am sure, get Arabic included.

Hofstadter has said that understanding of Natural Language will
produce AGI (Artificial General Intelligence). NOT going down the
"English" route, as Google has, would be a step towards AGI. If MOLTO
works, therefore, it will be a milestone on the AGI route.

I have myself written a program which could be used to do just what
you are suggesting. It takes Arabic text (no short vowels) and finds,
based on Buckwalter's dictionary, all the possible translations into
English. It also splits an Arabic word into stems and inflexions. A
Java program asks you to choose a meaning. If instead of English you
chose Arabic words with accents for short vowels + a Dewey type code
for meaning (displayed as Arabic sentences when the mouse is over
the word) it would go a long way to doing what you suggest. Of course
a Near Human Quality translator would do this for you.
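
A minimal sketch of the lookup step described above, in Python rather
than Java and not the program itself. The two-entry lexicon is invented
for the example and is NOT Buckwalter's data. The names TOY_LEXICON,
strip_diacritics and candidates are likewise assumptions. It only shows
the principle: strip the short vowels, then list every vowelled reading
with its gloss so that a user (or a later stage) can choose.

```python
import unicodedata

# Hypothetical toy lexicon (not Buckwalter's dictionary): unvowelled form
# mapped to a list of (vowelled form, English gloss) pairs.
TOY_LEXICON = {
    "حب": [("حُب", "love"), ("حَب", "grain")],
}


def strip_diacritics(word):
    """Drop combining marks (Unicode category Mn), which covers the
    Arabic short-vowel signs, shadda and sukun."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


def candidates(word, lexicon=TOY_LEXICON):
    """Return every (vowelled form, gloss) reading of a word, whether or
    not the input was written with its short vowels."""
    return lexicon.get(strip_diacritics(word), [])


if __name__ == "__main__":
    # List the readings so that one of them can be chosen.
    for number, (form, gloss) in enumerate(candidates("حب"), start=1):
        print(number, form, gloss)
```

A real version would also have to split off prefixes and suffixes before
the lookup, which this sketch does not attempt.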


  - Ian Parker

In general, disambiguation is done both locally and by word association.
On Aug 27, 10:48 pm, ahmed wrote:
> In the Arabic language, diacritical marks play an important role in
> many cases, especially in cases of vowels, ablaut and slurring. Because
> of that, a lot of errors occur in translation, especially when words
> have the same letters but different meanings, like the words "حُب",
> which means love, and "حَب", which means a type of grain.
> Unfortunately, in the existing translation process they are both
> translated as "love". So I suggest a new way of dealing with Arabic
> text with diacritical marks, so that it can be used in translating
> diacritized Arabic words or text.
>
> Please tell me how and where to send this
> suggestion. With thanks.
