Re: How Google Translate handles "variants" of Armenian?

edd Tue, 21 Dec 2010 02:43:19 -0800

Hi Josh

Many thanks for the prompt response!


I do appreciate the difficulty in handling these aspects. You seem to
be using ISO 639-1, which allocates code "hy" for all variants of
Armenian. ISO 639-3 is more granular and has different codes for the
different variants.

I guess that when a word has the same meaning in all three variants
there is no problem. But when a word has different meanings across
the variants, that would affect the quality of the translation, and I
don't know how effective statistical methods would be at addressing
this. Yesterday I read a couple of pages: the translation gave a
general idea of the text, but the details weren't particularly good,
and some words were just transliterated, I assume because they could
not be translated.

I wonder if you could give us an idea of how close Armenian is to
moving from the "alpha" stage to the standard supported list and
whether there is anything Armenian speakers could do to assist?

Best regards

e



On Dec 21, 12:34 am, Josh (Google Employee) wrote:
> Hi edd,
>
> Great question.  Unfortunately the answer is likely "we don't", that
> is, we don't handle the variants of Armenian.
> Many of the languages we support have several variants.
> Unfortunately, it is often difficult for us to tell the difference
> between documents written in the different variants automatically,
> which makes it difficult to collect training data that separates out
> all the variants.  It is also likely that, even if we could tell the
> variants apart, for some of them we simply would not find enough data
> to do anything effective with.
>
> The result is that we likely find documents in all three variants of
> Armenian and train our system on all of them, which probably
> results in a system dominated by whichever variant has the most
> content published on the web.
>
> Hope that helps your understanding, and my apologies that we cannot
> easily provide a way to handle these variants more effectively.
> If you're interested in more information, check out
> http://translate.google.com/about/, which has a nice description of
> how our system works.
>
> Best,
> Josh Estelle
> Senior Software Engineer
> Google Translate
>
> On Dec 20, 6:43 am, edd wrote:
>
> > Hello,
> > I am delighted Google Translate has introduced support for Armenian.
> > However considering there are three "variants" of Armenian - Classical
> > Armenian, Eastern Armenian and Western Armenian - how do you handle
> > this "three in one" situation? Some words that are present in one
> > variant are unknown or have different meaning in the other
> > variant(s).
> > Classical Armenian is the written language of the Bible, the church
> > and the classical literature.
> > Eastern Armenian is the language of the Republic of Armenia.
> > Western Armenian is the language of the large Armenian Diaspora.
> > To complicate matters further Eastern Armenian can be written using
> > the classical or the modern orthography.
> > So my question is how do you handle all this in Google Translate under
> > the single language code ("hy")?

-- 
You received this message because you are subscribed to the Google Groups 
"General" group.
For more options, visit this group at 
http://groups.google.com/group/google-translate-general?hl=en.
