Hi Francis,

lemmatisation would be interesting to try, but what about disambiguation?

"ambiguous stems/lemmas are given separated by '/' "

Can this be improved by your new lexical selection module somehow? It
would be better to choose the most probable lemma than simply the first.

And what about OOV words (not found in the dictionary, but present in the
corpus)? How should they be handled? Can the lemmas be guessed? I suppose
some statistical model might do the trick.

Or maybe the dictionary can be used in some inventive way? It contains a
lot of paradigms, but unfortunately nothing about how common they are.
What about sorting them according to frequency in a reference corpus? Or
adding the frequency as a tag in the paradigms?

(This might be useful anyway, e.g. when adding words to the monodix: a GUI
could propose the most likely paradigms at the top of a drop-down list.
That might minimise the risk of choosing a rare and probably wrong
paradigm.)

Yours,
Per Tunedal

On Tue, Mar 1, 2016, at 23:27, Francis Tyers wrote:
--snip--
> If you'd like to share any of your probabilistic lexicons for
> Swedish--Norwegian
> or Swedish--Danish we'd be interested in looking at them.
>
> If you have experience in SMT, the word alignments for Europarl for
> Swedish--Danish
> could be pretty useful! Especially if you use the lemmatisation step
> described here:
>
> http://wiki.apertium.org/wiki/Lemmatisation
>
> Fran
>
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
