This is a theory. Google has a different theory that is backed up by results. The size of the sentence-aligned corpus determines the quality of the translation. The algorithms are entirely secondary.
In the absence of a sentence-aligned corpus one must be created. People want good machine translations but such translations require people to first do part of the work. It's a perfectly reasonable symbiotic relationship. There is no reason to expect that this project 1) won't help Google and 2) won't help Wikipedia.

On Tue, Jun 9, 2009 at 3:57 PM, Amir E. Aharoni <[email protected]> wrote:

> On Wed, Jun 10, 2009 at 00:26, Brian <[email protected]> wrote:
> > Honestly, I should have learned by now to ignore comments like this.
> > Google is the leading world expert on machine translation and they
> > think it's a good idea. I understand why they think it's a good idea,
> > you don't. You're shooting straight from the gut.
>
> Not quite - i am finishing a degree in Linguistics and i work as an
> NLP programmer, so i know the field a little.
>
> Google is the leading world expert in searching vast amounts of text
> in English, a language with next to no morphology. They aren't as good
> at searching in Hebrew, Spanish and Russian. And their translation
> software doesn't even cover Persian, a language with a relatively
> simple morphology.
>
> Google appear to assume that the statistical approach to machine
> translation is the only one that matters and that their leadership in
> search technologies makes them the leaders in machine translation.
> They are wrong. The statistical approach helps, but humans don't think
> only statistically. The grammars of even the best-researched languages
> - English, French, German - are ridiculously far from being described
> completely. When i say "grammar", i refer to the whole language
> system: morphology, syntax, semantics, discourse analysis, typography,
> prosody, phonology and more. We can't teach computers grammar, because
> we don't really understand it ourselves, and without teaching
> computers proper grammar, the statistical approach is very limited.
>
> Google improved their translation software a little in the last couple
> of years but they are many, many years away from being able to
> translate a real text. Google translation paired with something like
> [[Universal Networking Language]] or maybe OmegaWiki may yield better
> results, but it will take many more years to complete. Of course,
> something may change and Big Companies may start pouring a lot of
> money into dictionary and grammar book writers. Until that happens,
> expect improvements in machine translation to be Very Slow.
>
> --
> אמיר אלישע אהרוני
> Amir Elisha Aharoni
>
> http://aharoni.wordpress.com
>
> "We're living in pieces,
> I want to live in peace." - T. Moore
>
> _______________________________________________
> foundation-l mailing list
> [email protected]
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
