Hi Mikel, yes, you're perfectly right. I didn't bother to look for the right word when replying: a fatal error in a discussion regarding the above subject :-)
I apologize for my negligence. Maybe 'ambiguity' would have been a better word, covering all of your examples. My point is that it's important to choose the right word!

Yours,
Per Tunedal

On Sun, May 12, 2013, at 21:43, Mikel Forcada wrote:
> Per:
> > Hi,
> > I regard categorical ambiguity (part of speech ambiguity) as a special
> > case of polysemy. What's important is translating to a word with the
> > right meaning.
> I think confusing categori[c]al ambiguity with polysemy is not a good
> idea, and a source of problems. I agree they have a similar result
> (different translations), but they are treated differently because they
> are different in nature. I've been building rule-based machine
> translation systems for almost 15 years, and this was clear to me from
> the outset.
>
> Polysemy is a property of the lemma of a word, and it is shared by all
> its inflected forms: _station_ is as polysemic as _stations_, because
> they have the same lemma _station_. The change in meaning has no
> syntactic effect: "I love this station" could be "I love this train
> station" or "I love this radio station".
>
> Categorial ambiguity is a property of a particular surface form (e.g.
> "books") and affects syntax: in "He books a room", "books" can only be
> a verb.
>
> There is a third case of ambiguity, which occurs when a surface form has
> more than one lexical form, but all have the same category. For
> instance, in Spanish, "creo" may be "I believe" or "I create": same
> category, same tense, same person and number.
>
> In my teaching I like to call the last two ambiguities "homography".
>
> > Yes, I intend to try Fran's new lexical selection module. But I was just
> > thinking that the current work flow is a bit odd:
> >
> > 1. Wouldn't it be more adequate to begin with finding the right word,
> Unless you define what you call "finding the right word", there is not
> much I can do to help.
> > rather than trying to fix it afterwards with a lexical selection module?
> > Yes, this is a new work flow with a disambiguator, rather than a tagger,
> > choosing the right word and indirectly deciding part of speech. (Rather
> > than the opposite).
> Define "choosing the right word" and how you intend to do that.
> >
> > Or alternatively:
> >
> > 2. Why not collect all possible translation options and evaluate them,
> > choosing the translation that seems most meaningful or fluent?
> Felipe Sánchez-Martínez's PhD thesis, of which I was co-advisor, studied
> how HMM-based part-of-speech taggers could be trained using a
> target-language model. And as a point of comparison, he used a system
> that chose the best disambiguation of each sentence using a statistical
> target language model. In the languages he studied (all European) he
> found ambiguity rates of 1.3 lexical forms per surface form. This, for
> your typical 20-word sentence, means that you have to consider 1.3^20
> ≈ 190 readings. He had to do this during training. He devised a way to
> choose the winner at translation time before having to score all
> possible readings, and got quite far.
>
> BTW, one interesting result was that his tagging accuracy was not as
> good as that of a tagger trained on hand-tagged text, but the
> translation error rate was almost as good. Tagging was just a way to get
> the best translation.
> > (Something like what's done in statistical translation by weighting the
> > translations by the language model.)
> Yes, see above.
> > Yes, this is a new "parallel" work flow with several competing
> > translations, evaluated in the end.
> Slow as molasses. Ask Felipe.
> >
> > BTW I don't like the idea of using a constraint grammar. I hope
> > something more automatic could be invented.
> I will comment on that in a minute.
>
> Mikel
>
> --
> Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
> Departament de Llenguatges i Sistemes Informàtics
> Universitat d'Alacant
> E-03071 Alacant, Spain
> Phone: +34 96 590 9776
> Fax: +34 96 590 9326
>
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
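
For anyone following the thread, here is a rough sketch of the arithmetic behind Mikel's estimate of about 190 readings. It is only an illustration: the toy lexicon, the tag strings and the function names are all made up for this example and are not Apertium's data or API. It also assumes, as the estimate does, that the average rate of 1.3 lexical forms per surface form can be applied uniformly to every word of the sentence.

```python
from itertools import product
from math import prod

# Hypothetical toy lexicon: surface form -> candidate lexical forms.
# The tag strings are illustrative only, not a real tagset.
ANALYSES = {
    "he": ["he<prn>"],
    "books": ["book<n><pl>", "book<vblex><pres>"],
    "a": ["a<det>"],
    "room": ["room<n><sg>", "room<vblex><inf>"],
}


def count_readings(sentence):
    """Number of whole-sentence readings: product of per-word ambiguity."""
    return prod(len(ANALYSES[word]) for word in sentence)


def all_readings(sentence):
    """Enumerate every combination of lexical forms for the sentence."""
    return list(product(*(ANALYSES[word] for word in sentence)))


def estimated_readings(rate, length):
    """Estimate for a sentence of `length` words at an average ambiguity `rate`."""
    return rate ** length


sentence = ["he", "books", "a", "room"]
print(count_readings(sentence))              # 1 * 2 * 1 * 2 readings
for reading in all_readings(sentence):
    print(" ".join(reading))
print(round(estimated_readings(1.3, 20)))    # the 20-word case from the thread
```

Every one of those readings would have to be translated and scored by the target-language model, which is why the exhaustive "parallel" work flow gets slow as sentences grow.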
