Is there any way to correct such a "joke" text that's been submitted
to Google?  I keep running into more of these.  For example, punch in
"sigur rós", Icelandic to English (Sigur Rós is an Icelandic band; a
grammatically incorrect translation would be "Victory Rose").  The
Google result?  "Foo Fighters" (a different, non-Icelandic band).

If there's no way this can be fixed, Icelandic translations should
simply be taken down.

 - Karen

On Apr 14, 12:11 am, Glossika Languages wrote:
> Let me provide you some insight. I'm not affiliated with Google, but
> I'm a polyglot and I understand how these systems work.
>
> Google Translate works really well when you know how to use it and
> what kinds of texts it can translate. Basically, the more colloquial
> something is, the more difficult it will be, for reasons I explain
> below.
>
> The translation service uses statistical methods based on
> parallel texts. In other words, there's a huge database of
> sentences paired with their translations. Most of these pairs are
> between English and one other language.
> The greatest number of parallel-text documents are reports, news,
> government translations, technical material, patents, etc. But for
> some languages like Icelandic, the amount of parallel text is much
> smaller than for more commonly used languages. The downside to the
> translate feature shows up when the sentence you're requesting never
> appeared in any report (especially sentences with "you" and "I" and
> "could you tell me where" etc.) and is generally only spoken: Google's
> algorithm kicks in, grabbing the most common sentence structure
> and replacing the vocabulary. I believe a future add-on might be a
> cross-check against actual usage in the blogosphere, as this
> better represents spoken language. But Google is already working on
> transcribing the spoken language in YouTube videos, which would be
> even more accurate than the blogosphere. It will be some time before
> this matures, however.
>
> If you translate between two other languages like German and Polish,
> the text will go through English first, which may erase some of the
> verb conjugations that exist in those two languages. Some
> languages like Belarusian and Ukrainian go through Russian as an
> intermediate language, not English, so the verb conjugations and
> declensions all stay intact (but not between Russian and Polish).
>
> Getting back to your Icelandic. Many people have contributed parallel
> texts directly to Google's database. Google rarely builds the
> contents of these databases itself, but rather gathers them from
> users in any way it can. If you go in there and look around, you can
> see there are some 300 languages you can provide parallel texts for.
> Once they get enough of a database built up, they can release a
> language in beta. These texts could potentially contain errors or even
> deliberate jokes. You can imagine that some kid sitting in Iceland is
> having fun inputting a whole book of jokes in both English and
> Icelandic and that ends up as the only source in Google's database of
> such sentences. When you type an exact match, Google spits out the
> exact match for you.
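
A toy Python sketch of the exact-match behaviour described above. This is
not Google's system, only an illustration under stated assumptions: the
names PARALLEL_CORPUS, WORD_TABLE and translate are made up here, the
sentence pairs are the ones quoted in this thread, and the word-by-word
fallback is a crude stand-in for "grabbing the most common sentence
structure and replacing the vocabulary".

```python
# Toy parallel corpus of (English, Icelandic) pairs taken from this thread.
# Two of the pairs are the "joke" entries a contributor could have submitted.
PARALLEL_CORPUS = [
    ("Where is", "Hvar er"),
    ("Where is the bathroom?", "Talarðu ensku?"),
    ("My hovercraft is full of eels", "Láttu mig í friði!"),
]

# Hypothetical word-level table used only when no whole sentence matches.
WORD_TABLE = {"where": "hvar", "is": "er"}

# Index the corpus by lower-cased source sentence for exact lookups.
EXACT = {src.lower(): tgt for src, tgt in PARALLEL_CORPUS}


def translate(sentence):
    """Return the stored translation on an exact match, else fall back."""
    hit = EXACT.get(sentence.lower())
    if hit is not None:
        # An exact match is returned verbatim, even if the stored pair is wrong.
        return hit
    # Fallback: replace known words one by one, leave unknown words untouched.
    return " ".join(WORD_TABLE.get(w.lower(), w) for w in sentence.split())


print(translate("My hovercraft is full of eels"))
print(translate("Where is Karen"))
```

The point of the sketch: if the only stored pair for a sentence is a joke,
the exact-match path returns the joke, while a slightly different sentence
takes the fallback path and comes out unaffected.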
>
> On Apr 13, 6:37 am, "[email address]" wrote:
>
> > Someone at Google is having fun playing with the translations returned
> > from English->Icelandic.  I've run into a number of things for which
> > the only possible explanation is someone hard-coding bogus answers
> > in.  Here's what I found first:
>
> > "Where" returns "Hvar" ("Where")
> > "Where is" returns "Hvar er" ("Where is")
> > "Where is the" returns "Hvar er" ("Where is")
> > "Where is the bathroom?" returns "Talarðu ensku?" ("Do you speak
> > English?", from að tala, to speak, and the accusative form of enska,
> > "English")
>
> > Har-di-har-har, Google.  You can ask other "Where is" questions and it
> > translates them correctly, but they hard-coded "Where is the bathroom"
> > to a joke answer.
>
> > Want more?  Here's a more blatant example.
>
> > "My" returns "My" (the possessive pronoun follows the noun
> > in Icelandic)
> > "My hovercraft" returns "Láttu mig" ("Let me", from að láta).
> > "My hovercraft is full of eels" returns "Láttu mig í
> > friði!" (basically, "Leave me in peace!")
>
> > There's absolutely no way that's an accident.  I mean, they even added
> > in an exclamation point!  And there's no way that Google has a
> > dictionary where "Láttu" means "Hovercraft".
>
> > Want more?  I keep finding these things.  Someone at Google apparently
> > thinks this service is a big joke.  Accidental translation mistakes
> > are one thing, but deliberate ones are an entirely different story
> > altogether, and these are clearly not accidental.
