Luis,
In addition to the multi-engine MT by Gabriel, there have been many
avenues to hybridization involving Apertium, and Felipe Sánchez-Martínez
has been part of most of them (check his webpage
http://www.dlsi.ua.es/~fsanchez/). I'll answer now, and he can complete
my answer later:
1. Felipe, Juan Antonio Pérez-Ortiz and I used nondeterministic
output followed by scoring with a statistical target-language
model to train the part-of-speech tagger of Apertium. However,
Felipe managed to transfer the scores to the part-of-speech tagger
so that it delivers only one analysis, with similar results.
The whole thing is implemented and is part of Apertium; Felipe
can tell you which packages. The main paper is:
* http://www.springerlink.com/content/m452802q3536044v/fulltext.pdf
2. Another thing that Felipe Sánchez-Martínez did was to mix
translation units from a corpus with Apertium output. We published
a paper on this:
* http://www.dlsi.ua.es/~fsanchez/pub/pdf/sanchez-martinez09d.pdf
3. Finally, Felipe's student Víctor Sánchez Cartagena has been
working hard on hybridization, adding Apertium-generated
translation units to a statistical MT system (the resulting
system, Alacant, was one of the best systems in the WMT 2011
contest for Spanish--English, see
http://www.mt-archive.info/WMT-2011-Callison-Burch.pdf):
* http://www.dlsi.ua.es/~fsanchez/pub/pdf/sanchez-cartagena11c.pdf
* http://www.dlsi.ua.es/~fsanchez/pub/pdf/sanchez-cartagena11b.pdf
* http://www.dlsi.ua.es/~fsanchez/pub/pdf/sanchez-cartagena11a.pdf
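To make the scoring step in point 1 more concrete, here is a minimal
sketch in Python. It is not Apertium code and not the method from the
papers above: it only shows how several candidate translations could be
ranked with a target-language model. The toy corpus, the hand-written
candidate list and the function names (train_bigram_lm, log_prob,
pick_best) are all assumptions made up for illustration; in a real
setup the candidates would come from the Apertium pipeline and the
language model would be trained on a large corpus.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams) + 1  # +1 reserves mass for unseen words
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def pick_best(candidates, unigrams, bigrams):
    """Return the candidate that the language model scores highest."""
    return max(candidates, key=lambda c: log_prob(c, unigrams, bigrams))


# Toy target-language corpus (assumption: stands in for a large monolingual corpus).
corpus = [
    "the house is white",
    "the wine is red",
    "he drinks the wine",
    "she came to the house",
]
unigrams, bigrams = train_bigram_lm(corpus)

# Hypothetical alternatives for Spanish "el vino es bueno", where "vino"
# is ambiguous between the noun "wine" and the verb form "came".
candidates = ["the wine is good", "the came is good"]
best = pick_best(candidates, unigrams, bigrams)
print(best)
```

With this toy corpus the model prefers "the wine is good", because the
bigram "the wine" was seen in training and "the came" was not. The
sketch stops at choosing a candidate; it does not show how the scores
are transferred back to the part-of-speech tagger.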
Hope this has helped!
Good luck
Mikel L. Forcada
On 10/09/2011 12:17 AM, Luis Chiruzzo - Inco wrote:
Hello,
My name is Luis and I'm very interested in this project and what it
can do. It's really amazing!
I wanted to know if there have been attempts to combine the rule-based
approach of Apertium with some statistical processing. For example,
has anybody tried to get a non-deterministic output from Apertium
rules, and then use another method to choose between the possible
outcomes? I would really like to try this out. I've been googling to
find out whether anyone has taken that approach, but I haven't been
able to find anything on the subject.
Thanks,
Luis
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326