Many of the answers talked about statistical tools in Apertium
that I didn't know about. I will give another kind of answer, not as an
Apertium developer (not yet!), but as a user of several automatic
translators.

Google Translate is known to be a statistical translator.
Others, like Apertium or the French commercial system Systran, are
rule-based and lexicon-based translators.

I have used all of them to translate my websites from French.
With the rule-based and lexicon-based translators, I sometimes find words
that are not correctly translated (a problem of homonyms, but in that case
it is easy to replace one word with another every time it occurs), or not
translated at all with Apertium (a problem of coverage). Even so, it is
with Google Translate that I had the worst surprises.

Sometimes part of a sentence is completely dropped. Sometimes two
different words are used in the English translation, depending on the
sentence, to translate the same French word!

Once or twice, Google gave me a nice result by entirely reformulating
a complete sentence, but more often a reformulation like that turns out
to be a rather random translation without real meaning.

English and French differ in word order (between nouns and adjectives,
for instance), and every time a translator changes the word order, the
meaning can change too.
When using Google, I often saw translations where the result would have
been better if it had kept the word order of the source text!

I don't know if Google is better from English to French than from French
to English. At present, I am translating the Apertium wiki into French.
Comparing the output of the two translators gives me ideas.

I don't know if Francis Tyers made special efforts to be understood by
non-native English speakers when he wrote a large part of the wiki, but
in general these pages are easy to follow, and Google often produces
well-formed sentences. But when I think another word should be used
somewhere, that word is often the one chosen by the deterministic
translator Systran.

There are still a few sentences that I don't understand and that give
strange results with both kinds of translators. For the sentence
"The languages are related with varying levels of mutual intelligibility",
one problem was to choose a good word (not too literal) for "related",
the other to choose a good order for the last 5 words. From 1 2 3 4 5
in the original text, it becomes 2 3 5 4 1 in my translation. That is
far too complicated for an automatic translator of any kind.
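
To make that reordering concrete, here is a small Python sketch that
applies the order 2 3 5 4 1 to the last five words of the English
sentence. The function name `reorder` is only my own illustration; it
shows the permutation itself, not how any translator works internally.

```python
def reorder(words, order):
    """Return the words rearranged so that slot i holds words[order[i] - 1]."""
    # The order is 1-based, as in the text (1 2 3 4 5 -> 2 3 5 4 1).
    if sorted(order) != list(range(1, len(words) + 1)):
        raise ValueError("order must be a permutation of 1..len(words)")
    return [words[i - 1] for i in order]


sentence = "The languages are related with varying levels of mutual intelligibility"
last_five = sentence.split()[-5:]   # varying levels of mutual intelligibility
reordered = reorder(last_five, [2, 3, 5, 4, 1])
print(" ".join(reordered))
```

The printed English words are in the order the translated sentence needs,
with the adjectives moved after the noun they qualify.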


Speaking now more about Apertium, I translated a page about a
tool called GetAlignmentWithText.

I noticed that this tool is particularly slow. I didn't follow exactly
what it is for. But if this tool is meant to find words and their
translations by comparing the positions of the words in the original text
and in the translated version, that would explain why the process is slow.
That would be a kind of statistical processing done to get word
correspondences, and maybe (why not) transfer rules.
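
As a rough illustration of what I mean by statistical word correspondence,
here is a minimal Python sketch. It is my own assumption, not the actual
method of GetAlignmentWithText, which I do not know. It counts how often
each source word appears in the same sentence pair as each target word,
and keeps the target word with the highest Dice score. The function name
`cooccurrence_alignment` and the toy corpus are invented for the example.

```python
from collections import Counter
from itertools import product


def cooccurrence_alignment(pairs):
    """For each source word, pick the target word with the highest Dice score."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Every source word is paired with every target word of the sentence:
        # this quadratic step is one reason such processing can be slow.
        joint.update(product(src_words, tgt_words))

    best = {}
    for (s, t), n in joint.items():
        # Dice coefficient: 1.0 when the two words always occur together.
        score = 2 * n / (src_count[s] + tgt_count[t])
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return {s: t for s, (t, _) in best.items()}


# Hypothetical toy corpus (French, English), invented for illustration only.
corpus = [
    ("le chat noir", "the black cat"),
    ("le chien noir", "the black dog"),
    ("le chat dort", "the cat sleeps"),
    ("un chien dort", "a dog sleeps"),
]
alignment = cooccurrence_alignment(corpus)
print(alignment)
```

Even on four short sentences, every word has to be compared with every
word on the other side, so the cost grows quickly with the size of the
texts. A real tool would also need to handle word order and ties, which
this sketch ignores.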



> Date: Sat, 8 Oct 2011 20:17:16 -0200
> From: Luis Chiruzzo - Inco <[email protected]>
> To: [email protected]
> Reply-To: [email protected]
> Subject: [Apertium-stuff] Statistical Apertium
>
> Hello,
>
> My name is Luis and I'm very interested in this project and what it can do,
> it's really amazing!
>
> I wanted to know if there have been attempts to combine the rule-based
> approach of apertium with some statistical processing. For example, has
> anybody tried to get a non-deterministic output from apertium rules, and
> then use another method to choose between the possible outcomes? I really
> would like to try this out. I've been googling to know if someone took that
> approach, but I haven't been able to find anything on the subject.
>
> Thanks,
> Luis
>
--------------------------------
Bernard Chardonneau (France)
Phone : [33] 1 64 90 87 04 (from Sept to June except holidays)
GSM phone : [33] 6 49 95 13 95 (French school holidays, C zone)

Multilingual websites for my free software:
http://libremail.free.fr and http://libremail.tuxfamily.org
http://cyloop.tuxfamily.org (mainly translated with Apertium)

My general website (in French only)
http://bech.free.fr

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
