Yes, indeed it is. The Wikipedia reference is
http://en.wikipedia.org/wiki/Google_Translate
You can look up the work of Franz Josef Och yourself. The Wikipedia
article is a good starting reference.
Statistical translation requires large databases. "When she was good
she was very very good and when she was bad she was horrid." This
seems to sum up statistical translation. It makes gross errors (like
getting the names of countries wrong) which a simple dictionary-based
system would never make. It is also prone to hacking.
In statistical translation, because it is statistical, assumptions
creep in. Women say they can't drive more often than men do, or perhaps
they just admit it more often! This does form a part of statistical
word association.
Mind you, with Arabic NEVER being OVS, the case of the settler killing a
Palestinian is clearly the result of a hack. The Israelis therefore
have little cause for complaint.
Google, if it is to maintain leadership in translation, will have to
combine statistical translation with other approaches. I suggest 2
things.
1) Generic classes, e.g. {country=Angleterre}.
2) Use LSA (latent semantic analysis), which Google uses for search but
not for translation.
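
To make suggestion 1 concrete, here is a rough Python sketch of what I
mean by a generic class. It is only an illustration, not how Google
does it. The COUNTRY lexicon and the function names are made up for the
example, and it assumes the statistical system passes the {country}
placeholder through untouched and keeps the placeholders in the same
order.

```python
import re

# Hypothetical closed class: French country names -> English.
# A real system would load this from a curated dictionary.
COUNTRY = {"Angleterre": "England", "Allemagne": "Germany", "Espagne": "Spain"}

def abstract_classes(sentence, lexicon=COUNTRY, tag="country"):
    """Replace each class member with a {tag} placeholder.

    Returns the abstracted sentence and the members found, in order.
    """
    found = []

    def repl(match):
        found.append(match.group(0))
        return "{%s}" % tag

    pattern = r"\b(" + "|".join(map(re.escape, lexicon)) + r")\b"
    return re.sub(pattern, repl, sentence), found

def restore_classes(translated, found, lexicon=COUNTRY, tag="country"):
    """Fill the placeholders, left to right, from the dictionary.

    Assumes the translator preserved placeholder order.
    """
    for word in found:
        translated = translated.replace("{%s}" % tag, lexicon[word], 1)
    return translated

# The statistical system only ever sees "Je vais en {country}", so it
# cannot learn a wrong country name from skewed or hacked training data.
abstracted, found = abstract_classes("Je vais en Angleterre")
# Stand-in for the statistical translator's output on the abstracted text.
translated = "I am going to {country}"
print(restore_classes(translated, found))
```

The country name itself comes from the dictionary, so the statistics
only have to get the sentence frame right.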
- Ian Parker
On Sep 30, 1:55 am, gameswithwords wrote:
> I'm discussing, in passing, statistics-based approaches to machine
> translation (e.g., rather than generative grammar-based) in an
> upcoming general-public science article. From the little I've found
> online so far, it looks like Google Translate is what I'd call a
> statistics-based approach (based largely on statistical co-occurrences
> between words, rather than anything with complex structure).
>
> But, since I can't find a more in-depth report, I'm not sure, and I'd
> rather be sure before publication. Can anyone give me more information?