Hi Francis,
Lemmatisation would be interesting to try, but what about
disambiguation? The wiki page says:

"ambiguous stems/lemmas are given separated by '/' "

Can this be improved by your new lexical selection module somehow? It
would be better to choose the most probable lemma rather than simply
the first.
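
To make the idea concrete, here is a minimal sketch in Python. It assumes
a plain unigram table of lemma counts taken from some reference corpus;
the function name pick_lemma and the toy counts are invented for
illustration and are not part of Apertium or the lexical selection module.

```python
from collections import Counter


def pick_lemma(ambiguous, lemma_counts):
    """Return the most frequent candidate from a '/'-separated lemma string."""
    candidates = ambiguous.split("/")
    # max() keeps the first candidate on ties, so with no counts at all
    # this falls back to the current behaviour (take the first lemma).
    return max(candidates, key=lambda lemma: lemma_counts.get(lemma, 0))


# Toy counts, invented for the example.
lemma_counts = Counter({"vara": 120, "var": 15})
print(pick_lemma("var/vara", lemma_counts))
```

A context-free count like this is only a baseline; a real disambiguator
would also look at the surrounding words.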

And what about OOV words (out-of-vocabulary: not found in the
dictionary, but present in the corpus)? How should they be handled? Can
their lemmas be guessed? I suppose some statistical model might do the
trick.
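
One simple statistical model would be a suffix guesser: learn from known
(form, lemma) pairs which suffix rewrite is most common, then apply the
longest matching suffix to an unknown form. A rough sketch in Python
follows; the function names, the max_suffix limit and the three training
pairs are all assumptions made up for the example.

```python
from collections import Counter, defaultdict


def learn_suffix_rules(pairs, max_suffix=5):
    """Count (form suffix -> lemma suffix) rewrites seen in known pairs."""
    rules = defaultdict(Counter)
    for form, lemma in pairs:
        # Length of the common prefix of form and lemma.
        p = 0
        while p < min(len(form), len(lemma)) and form[p] == lemma[p]:
            p += 1
        # Record a rule for every suffix length that covers the change.
        for n in range(len(form) - p, min(max_suffix, len(form)) + 1):
            cut = len(form) - n
            rules[form[cut:]][lemma[cut:]] += 1
    return rules


def guess_lemma(form, rules, max_suffix=5):
    """Apply the most frequent rewrite of the longest known suffix."""
    for n in range(min(max_suffix, len(form)), -1, -1):
        cut = len(form) - n
        suffix = form[cut:]
        if suffix in rules:
            replacement = rules[suffix].most_common(1)[0][0]
            return form[:cut] + replacement
    # No rule matched: leave the form unchanged.
    return form


# Invented training pairs (Swedish definite plurals).
pairs = [("bilarna", "bil"), ("hundarna", "hund"), ("flickorna", "flicka")]
rules = learn_suffix_rules(pairs)
print(guess_lemma("stolarna", rules))
print(guess_lemma("kvinnorna", rules))
```

In practice the pairs could be generated by expanding the monodix, so no
hand-annotated data would be needed.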

Or maybe the dictionary can be used in some inventive way? It contains a
lot of paradigms, but unfortunately nothing about how common they are.
What about sorting them according to frequency in a reference corpus? Or
adding the frequency as a tag in the paradigms? (That might be useful
anyway, e.g. when adding words to the monodix: a GUI could propose the
most likely paradigms at the top of a drop-down list. That might reduce
the risk of choosing a rare and probably wrong paradigm.)
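
As a sketch of the sorting idea in Python: given a lemma-to-paradigm
mapping extracted from the monodix and lemma counts from a reference
corpus, sum the counts per paradigm and sort. The paradigm names and
counts below are invented, and rank_paradigms is a hypothetical helper,
not an existing Apertium tool.

```python
from collections import Counter


def rank_paradigms(lemma_to_paradigm, lemma_counts):
    """Return paradigm names, most frequent in the corpus first."""
    totals = Counter()
    for lemma, paradigm in lemma_to_paradigm.items():
        # Paradigms whose lemmas never occur still get an entry (count 0).
        totals[paradigm] += lemma_counts.get(lemma, 0)
    # Sort by descending count, then by name so the order is stable.
    return sorted(totals, key=lambda p: (-totals[p], p))


# Invented example data.
lemma_to_paradigm = {
    "bil": "bil__n",
    "stol": "bil__n",
    "flicka": "flick/a__n",
    "mus": "m/us__n",
}
lemma_counts = {"bil": 40, "stol": 25, "flicka": 30}
print(rank_paradigms(lemma_to_paradigm, lemma_counts))
```

This counts corpus tokens; counting how many distinct lemmas use each
paradigm instead might be a better predictor for newly added words.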

Yours,
Per Tunedal


On Tue, Mar 1, 2016, at 23:27, Francis Tyers wrote:

--snip--

> If you'd like to share any of your probabilistic lexicons for
> Swedish--Norwegian or Swedish--Danish we'd be interested in looking
> at them.
>
> If you have experience in SMT, the word alignments for Europarl for
> Swedish--Danish could be pretty useful! Especially if you use the
> lemmatisation step described here:
> 
> http://wiki.apertium.org/wiki/Lemmatisation
> 
> Fran
> 
> ------------------------------------------------------------------------------
> Site24x7 APM Insight: Get Deep Visibility into Application Performance
> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
> Monitor end-to-end web transactions and take corrective actions now
> Troubleshoot faster and improve end-user experience. Signup Now!
> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
