[Apertium-stuff] possible improvements to en-ca

Francis Tyers Mon, 03 Oct 2011 03:23:19 -0700

Hey all,

In preparing my test corpus for experiments with lexical selection, I've
done a ~30,000-word evaluation of Catalan->English and have come up
with the following list of observations:


https://apertium.svn.sourceforge.net/svnroot/apertium/branches/apertium-en-ca/dev/observations.ca-en.txt

They cover multiwords, transfer rules, missing morphology, and lexical
rules, though most of them are multiwords.

I don't have time to make the changes myself, but maybe someone on the
list is interested. By my calculations the current error rate is around
45%, but the texts haven't been properly checked yet.

Fran

PS. I have started to write a page for discussion of the lexical
selection module here:

http://wiki.apertium.org/wiki/Constraint-based_lexical_selection_module

I would appreciate input on the talk page. 


