[Apertium-stuff] Choosing the right word was: Re: Status of the pair sv-da

Per Tunedal Sun, 12 May 2013 11:21:03 -0700

Hi,
I regard categorial ambiguity (part-of-speech ambiguity) as a special
case of polysemy. What's important is translating to a word with the
right meaning.


Yes, I intend to try Fran's new lexical selection module. But I was just
thinking that the current workflow is a bit odd:

1. Wouldn't it be more appropriate to begin by finding the right word,
rather than trying to fix it afterwards with a lexical selection module?
Yes, this would be a new workflow, with a disambiguator rather than a
tagger choosing the right word and thereby indirectly deciding the part
of speech (rather than the other way round).

Or alternatively:

2. Why not collect all possible translation options and evaluate them,
choosing the translation that seems most meaningful or fluent?
(Something like what's done in statistical translation, where candidate
translations are weighted by the language model.)
Yes, this would be a new "parallel" workflow with several competing
translations, evaluated at the end.
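
To make option 2 a bit more concrete, here is a minimal sketch in
Python. It is not how Apertium works today, just an illustration of the
idea under toy assumptions: the function names, the tiny English
"corpus" and the per-word option lists are all hypothetical, and the
language model is a plain add-one smoothed bigram model rather than
anything a real system would use.

```python
import itertools
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a token sequence."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + list(tokens) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Counter returns 0 for unseen words and bigrams.
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def best_translation(options, unigrams, bigrams):
    """Pick the most fluent combination.

    `options` holds one list of candidate target words per source word,
    regardless of part of speech.
    """
    candidates = itertools.product(*options)
    return max(candidates, key=lambda c: log_prob(c, unigrams, bigrams))


# Hypothetical target-language corpus (toy data, assumption).
corpus = ["the sheep eat grass", "she gets a book", "the sheep sleep"]
unigrams, bigrams = train_bigram_model(corpus)

# The second source word is ambiguous between a noun and a verb reading.
options = [["the"], ["sheep", "gets"], ["eat"], ["grass"]]
print(" ".join(best_translation(options, unigrams, bigrams)))
```

Note that the number of candidates is the product of the number of
options per word, so enumerating them all like this only works for
short sentences; anything realistic would need some kind of pruning.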

BTW I don't like the idea of using a constraint grammar. I hope
something more automatic can be invented.

Yours,
Per Tunedal


On Sun, May 12, 2013, at 19:46, Mikel Forcada wrote:
> On 05/12/2013 02:26 PM, Per Tunedal wrote:
> > A better disambiguation would be most helpful. Maybe it would be possible
> > to translate all possible matches, disregarding the part of speech, and
> > later choose the translation that makes most sense/is the most fluent in
> > the target language? Or use a disambiguator instead of the tagger? I
> > will gladly discuss this in a separate thread.
> Are we talking about categorial ambiguity or polysemy here?
> 
> As regards categorial ambiguity, the transfer part of Apertium requires 
> that only one lexical form is present, so the flow you suggest is not 
> currently possible. However, it is possible to train the tagger in 
> different ways (using a hand-tagged corpus, or using the target language 
> [apertium-tagger-training-tools]). In addition to that, there are ways 
> to remove and select lexical forms before the statistical tagger using 
> constraint grammar.
> 
> As regards polysemy (a property of the lemma), Fran is about to release 
> the lexical selection module he has been developing as part of his PhD 
> thesis. One can use it with handwritten rules, or train it using 
> parallel or monolingual corpora.
> 
> Best,
> 
> Mikel
> 
> -- 
> Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
> Departament de Llenguatges i Sistemes Informàtics
> Universitat d'Alacant
> E-03071 Alacant, Spain
> Phone: +34 96 590 9776
> Fax: +34 96 590 9326
> 
> 

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
