Hi,

I regard categorial ambiguity (part-of-speech ambiguity) as a special case of polysemy. What's important is translating to a word with the right meaning.
Yes, I intend to try Fran's new lexical selection module. But I was just thinking that the current workflow is a bit odd:

1. Wouldn't it be more appropriate to begin by finding the right word, rather than trying to fix it afterwards with a lexical selection module? Yes, this is a new workflow, with a disambiguator rather than a tagger choosing the right word and thereby indirectly deciding the part of speech (rather than the other way round).

Or alternatively:

2. Why not collect all possible translation options and evaluate them, choosing the translation that seems most meaningful or fluent? (Something like what is done in statistical translation, where candidate translations are weighted by the language model.) Yes, this is a new "parallel" workflow with several competing translations, evaluated at the end.

BTW, I don't like the idea of using a constraint grammar. I hope something more automatic could be invented.

Yours,
Per Tunedal

On Sun, May 12, 2013, at 19:46, Mikel Forcada wrote:
> On 05/12/2013 02:26 PM, Per Tunedal wrote:
> > A better disambiguation would be most helpful. Maybe it would be possible
> > to translate all possible matches, disregarding the part of speech, and
> > later choose the translation that makes most sense/is the most fluent in
> > the target language? Or use a disambiguator instead of the tagger? I
> > will gladly discuss this in a separate thread.
> Are we talking about categorial ambiguity or polysemy here?
>
> As regards categorial ambiguity, the transfer part of Apertium requires
> that only one lexical form is present, so the flow you suggest is not
> currently possible. However, it is possible to train the tagger in
> different ways (using a hand-tagged corpus, or using the target language
> [apertium-tagger-training-tools]). In addition to that, there are ways
> to remove and select lexical forms before the statistical tagger using
> constraint grammar.
>
> As regards polysemy (a property of the lemma), Fran is about to release
> the lexical selection module he has been developing as part of his PhD
> thesis. One can use it with handwritten rules, or train it using
> parallel or monolingual corpora.
>
> Best,
>
> Mikel
>
> --
> Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
> Departament de Llenguatges i Sistemes Informàtics
> Universitat d'Alacant
> E-03071 Alacant, Spain
> Phone: +34 96 590 9776
> Fax: +34 96 590 9326
>
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
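
A minimal sketch of option 2 above (collect every translation option, then let a target-language model pick the most fluent combination). This is only an illustration under stated assumptions, not how Apertium or the lexical selection module works: the word lists, the tiny corpus, and the names OPTIONS, CORPUS, train_bigrams, log_prob and best_translation are all hypothetical, and an add-one-smoothed bigram model stands in for a real language model.

```python
import itertools
import math
from collections import Counter

# Hypothetical toy data: all translation options per source word,
# regardless of part of speech. Not taken from any Apertium dictionary.
OPTIONS = {
    "jag": ["I"],
    "ser": ["see", "sees"],
    "en": ["a", "one"],
    "bok": ["book", "beech"],
}

# Tiny made-up target-language corpus standing in for a real language model.
CORPUS = [
    "I see a book",
    "I see a house",
    "I read a book",
    "she sees a beech tree",
]


def train_bigrams(sentences):
    """Count unigrams and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        unigrams.update(tokens[:-1])  # contexts only; "</s>" is never a context
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams, len(vocab)


def log_prob(tokens, model):
    """Log-probability of a token list under add-one-smoothed bigrams."""
    unigrams, bigrams, vocab_size = model
    padded = ["<s>"] + list(tokens) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )


def best_translation(source_words, options, model):
    """Enumerate every combination of options and keep the most fluent one."""
    candidates = itertools.product(*(options[w] for w in source_words))
    return max((list(c) for c in candidates), key=lambda c: log_prob(c, model))


if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    print(" ".join(best_translation("jag ser en bok".split(), OPTIONS, model)))
```

On this toy data the sketch picks "I see a book". Note that the number of candidates is the product of the number of options per word, so exhaustive enumeration like this only works for very short inputs; a realistic version would need some form of pruning such as beam search.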
