Hi, I've tagged some new releases of nno, nob and apertium-nno-nob.

Like before[0], the work has been funded by the Norwegian Ministry of Culture via Nynorsk pressekontor (NPK) and the Norwegian News Agency, now with direct commits from contributors Anja, Victoria and Hallvard of NPK :-)

One major visible change is that we now let the user select a number of spelling variants using a new preferences system. Instead of compiling one FST per set of style choices, we just generate all choices on the fly and disambiguate.[1] You can try it already on the Beta site[2] (though currently it may fail if you use it with Transfuse[3]). Before, the user could only select whether infinitives ended in -e or -a; now they can also pick whether the first person plural pronoun should be "me" or "vi", whether words like "byggje" should have the optional j, etc. The options can be combined freely, and we'll be adding more in the future.

Since last time, we've also updated the monolingual dictionaries with new entries from the updated Norsk ordbank[4], and gotten lots of new bidix entries through that as well.

Other changes:

- 41 new transfer rules
- 614 new lrx rules
- about 800 new names and 26,800 new non-names added to bidix (many added by script via new Norsk ordbank entries)
- many transfer tweaks, e.g. adverbs can move past noun phrases, new constructions recognised
- lots of work on nob disambiguation, especially on noun vs verb and on participles (which gain a distinction in nno that they don't have in nob)
- much more consistent default nno spelling choices
- rules for name guessing using CG
- number compounding + more left-hand-only compound parts

WER on news text stays at around 4%: on the one hand we're reaching deep into the long tail of unknown words, and on the other hand we're spending more time making things more idiomatic with multi-word rules.
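
For anyone curious about the "generate all choices, then disambiguate" idea behind the preferences system, here is a toy Python sketch of the principle. It is not the actual Apertium implementation: the real pipeline works on analysed streams, while this just filters a hand-written candidate list. All names (Candidate, OPTION_GROUPS, choose) and the defaults are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    form: str             # surface form in nno
    features: frozenset   # hypothetical variant tags carried by this form


# Every variant is generated up front, whatever the user prefers.
CANDIDATES = {
    "bygge": [
        Candidate("byggje", frozenset({"inf_e", "with_j"})),
        Candidate("byggja", frozenset({"inf_a", "with_j"})),
        Candidate("bygge", frozenset({"inf_e", "no_j"})),
        Candidate("bygga", frozenset({"inf_a", "no_j"})),
    ],
    "vi": [
        Candidate("vi", frozenset({"pron_vi"})),
        Candidate("me", frozenset({"pron_me"})),
    ],
}

# Mutually exclusive option groups. The first value in each group is the
# default here; that choice is arbitrary and only for the example.
OPTION_GROUPS = {
    "infinitive": ("inf_e", "inf_a"),
    "j": ("with_j", "no_j"),
    "we": ("pron_vi", "pron_me"),
}


def choose(word, prefs):
    """Pick the variant of `word` that is compatible with `prefs`."""
    selected = {
        group: prefs.get(group, values[0])
        for group, values in OPTION_GROUPS.items()
    }
    # Tags the user did not select are used to discard candidates.
    rejected = {
        value
        for group, values in OPTION_GROUPS.items()
        for value in values
        if value != selected[group]
    }
    candidates = CANDIDATES[word]
    survivors = [c for c in candidates if not (c.features & rejected)]
    # Fall back to the first candidate if the filter removed everything.
    return (survivors or candidates)[0].form


print(choose("bygge", {}))
print(choose("bygge", {"infinitive": "inf_a", "j": "no_j"}))
print(choose("vi", {"we": "pron_me"}))
```

The point of the design is that options combine without any extra compilation: adding a new option means adding one more group to filter on, rather than doubling the number of compiled variants.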

The next steps include better support for correcting capitalisation[5] and starting to use apertium-separable for MWEs.[6]

-Kevin

[0] https://sourceforge.net/p/apertium/mailman/apertium-stuff/thread/CABnmVq5J5Acc7r4XwtMgVR2eyd5dF2ab4gUsUv2ZWPzMWE5J7A%40mail.gmail.com/
[1] https://wiki.apertium.org/wiki/Dialectal_or_standard_variation#Overlapping_variants
[2] https://beta.apertium.org/index.eng.html?dir=nob-nno&q=Vi%20liker%20enten%20%C3%A5%20fortsette%20%C3%A5%20bygge%20n%C3%A5r%20vi%20blant%20annet%20s%C3%B8ker%20forskjellen%20mens%20dere%20er%20uenige.#translation
[3] https://github.com/TinoDidriksen/cg3/pull/75
[4] https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-41/
[5] We'd like to be able to state in bidix that "Xyz" should turn into "xyz", which is currently not possible. See also https://github.com/apertium/apertium/issues/75
[6] Currently blocked by https://github.com/apertium/apertium-separable/issues/36

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff