Thanks a lot for the explanation, Harald. Indeed, Google's machine translation is built on simple statistical principles and a large amount of training data. One advantage is that you don't have to add layer upon layer of linguistic rules as your system scales; the disadvantage (or trade-off) is that you can't add many rules at all.
Having said that, improving translation quality has always been the team's No. 1 priority.

Regards,
Xi

On Sep 5, 8:15 pm, Harald Korneliussen wrote:
> I am almost certain they use public domain bible translations in their
> parallel corpus, and maybe others as well. It is the most translated
> book in the world, after all, it would be a very odd omission.
>
> But what you must remember is that while it's very easy for us to
> decide whether "James" should be translated as "Santiago" or left as
> "James" (as it probably should for "James Bond", for instance!), it is
> very hard for a computer. Google uses statistical methods, sometimes
> they will get it wrong. From my observations, I believe they were
> indeed more aggressive in translating names before. But that can give
> dramatic errors, such as the time "Bush" was translated as "Sarkozy"
> in French, or a massacre by soldiers from Myanmar was "translated"
> into a massacre by US soldiers.
>
> Have you seen the level of blind fury such mistranslations can cause?
> Since Google Translate doesn't really have a widely published contact
> address, lots of it has been posted to this forum. I can't exactly
> blame Google Translate to prefer leaving names untranslated, over
> "risky" translations that it has low confidence in.

--
You received this message because you are subscribed to the Google Groups "General" group.
For more options, visit this group at http://groups.google.com/group/google-translate-general?hl=en.
