Good evening,

New versions of the four Scandinavian language pairs are now available
from SourceForge, GitHub and apertium.org.

These releases come courtesy of Nynorsk pressekontor / NPK (an enclave
of Nynorsk journalists working within NTB, the Norwegian News
Agency[1]), with funding from the Norwegian Ministry of Culture. There
has been some press about the project.[2][3]

NPK have been using apertium-nno-nob in production since fall 2018 –
it's integrated into their translation/editing systems – and we've been
continually improving it with the help of their post-edits and
feedback. The form/spelling/style choices used by nob→nno are now more
modern and uniform (there was a major release of Nynorsk[4] back in
2012, while most style decisions in the translator were made in the
first release back in 2009).

Other major changes to nno-nob:
- 35 new transfer rules[5]
- 248 new lrx rules
- about 42,000 new names and 3,800 new non-names added to bidix
- regression testing by checking that WER (word error rate) does not rise
- lots of work on nob disambiguation
- we now do long-distance adjective agreement
- there's a post-nno.dix to get rid of triple consonants resulting from
  compounding
- compounding happens on proper nouns too now
- genitives are translated not just by preposition-rewriting, but we now
  also have:
  - lists of exceptions where we want to keep genitives
  - rewriting some nouns with relatives
  - rewriting nationalities with adjectives
  - rewriting some abstract nouns into compounds
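
To illustrate the WER regression check mentioned in the list above, here
is a minimal sketch. It is not the script used in the project: the
function names and the `tolerance` parameter are hypothetical, and it
assumes whitespace-tokenised sentences with one post-edited reference per
translation. WER is an error rate, so a regression shows up as an
increase over the stored baseline.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER: total edits divided by total reference words."""
    edits = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()
        edits += edit_distance(ref_tokens, hyp.split())
        words += len(ref_tokens)
    return edits / words if words else 0.0


def is_regression(baseline_wer, new_wer, tolerance=0.0):
    """True if the new WER is higher than the baseline plus tolerance."""
    return new_wer > baseline_wer + tolerance
```

In a test run one would translate a fixed set of source sentences,
compute `wer` against their post-edited references, and fail the build if
`is_regression` returns True for the stored baseline.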

The project is not yet done, but people have been asking about when the
fruits of it will show up on apertium.org :-)

The other three pairs have also had improvements since their last
release; some were also hitting pretty bad testvoc issues due to changes
in dependencies[6], so they get releases too. Apart from testvoc fixes,
the pairs have gotten some transfer rules and fixes merged in from
nno-nob (e.g. proper-noun compounding, and handling genitives in
coordinated NPs), and various disambiguation and vocabulary updates.


-Kevin


[1] https://en.wikipedia.org/wiki/Norwegian_News_Agency
[2] https://www.medier24.no/artikler/na-blir-det-nynorsk-bonanza-i-ntb-splitter-ny-robot-oversetter-artikler-automatisk-fra-bokmal/440934
[3] https://framtida.no/2018/08/08/nynorskrobot-ei-god-loysing-for-a-dekke-nynorskprosenten
[4] http://www.sprakradet.no/upload/Brosjyrer/Ny%20nynorskrettskriving.pdf
[5] One of which required a bugfix to apertium-transfer:
    https://github.com/apertium/apertium/commit/542de014a93c96905198f193e0a62a89317fa8a9
[6] https://github.com/apertium/apertium-packaging/issues/12



