Greetings Apertiumers!

I have two updates to report:

First, I have rewritten the postgenerator (again), this time as part
of apertium-separable (so it does not break the old one, unlike last
time). Postgenerator rules can now match on lemmas and tags in
addition to surface forms, and they apply iteratively to their own
output.

This is available as part of apertium-separable 0.7.0 and is
documented at https://wiki.apertium.org/wiki/Postgenerator
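
For anyone wondering what "iteratively apply to their own output" means
in practice, here is a toy sketch in Python. It is not the
apertium-separable rule format or implementation: the RULES list and
apply_rules function are hypothetical, and the rules only look at
surface strings (no lemmas or tags). It only illustrates rerunning the
rules until nothing changes, so that one rule's output can feed another.

```python
import re

# Hypothetical surface-form rules as (pattern, replacement) pairs.
# These are illustrative only, not real Apertium postgenerator rules.
RULES = [
    (re.compile(r"\ban other\b"), "another"),
    (re.compile(r"\ba ([aeiou])"), r"an \1"),
]

def apply_rules(text, rules, max_passes=10):
    """Apply every rule in order, repeating until the text stops changing."""
    for _ in range(max_passes):
        new_text = text
        for pattern, replacement in rules:
            new_text = pattern.sub(replacement, new_text)
        if new_text == text:
            # Fixed point reached: no rule matched on this pass.
            return text
        text = new_text
    # Give up after max_passes to guard against rules that never settle.
    return text

# One pass only turns "a other" into "an other";
# the second pass then lets the "an other" rule fire on that output.
one_pass = apply_rules("a other idea", RULES, max_passes=1)
settled = apply_rules("a other idea", RULES)
```

The max_passes cap is my own addition for the sketch, as a guard against
rule sets that keep rewriting each other forever.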

Second, I just added a pair of modules which move capitalization
information into word-bound blanks at the beginning of the pipeline
and then reapply them according to LRX-like rules at the end of the
pipeline, allowing all intermediate modules to operate solely on
dictionary case.

This should be available after the next nightly build (i.e. tomorrow)
in apertium 3.9.0, and is documented at
https://wiki.apertium.org/wiki/Capitalization_restoration
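
To give a rough idea of the extract-then-restore approach, here is a toy
sketch in Python. It does not use the real Apertium stream format,
word-bound blank syntax, or the LRX-like restoration rules: the "AA",
"Aa" and "aa" labels and all function names are hypothetical, and the
restore step simply reapplies each stored pattern to its own word.

```python
def case_pattern(word):
    """Classify a word's capitalization with a hypothetical label."""
    if len(word) > 1 and word.isupper():
        return "AA"  # all caps
    if word[:1].isupper():
        return "Aa"  # initial capital
    return "aa"      # dictionary (lower) case

def extract_caps(words):
    """Start of pipeline: pair each lowercased word with its pattern."""
    return [(case_pattern(w), w.lower()) for w in words]

def restore_caps(pairs):
    """End of pipeline: reapply each stored pattern to its word."""
    out = []
    for pattern, word in pairs:
        if pattern == "AA":
            out.append(word.upper())
        elif pattern == "Aa":
            out.append(word[:1].upper() + word[1:])
        else:
            out.append(word)
    return out

# A stand-in for the intermediate modules: a lookup that only ever
# sees dictionary-case forms and carries the pattern along untouched.
TOY_DICT = {"the": "el", "rocket": "cohete", "nasa": "nasa"}

def toy_translate(pairs):
    return [(pattern, TOY_DICT.get(word, word)) for pattern, word in pairs]

extracted = extract_caps(["The", "NASA", "rocket"])
translated = restore_caps(toy_translate(extracted))
```

Mixed-case words such as "iPhone" would not survive this toy round
trip, which is one reason the real modules need rules rather than a
fixed mapping.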

If anyone has questions, would like help trying this out for a
language pair, or finds that I missed something in the documentation,
let me know.

Thanks to Kevin Unhammer and Marc Riera for helping me figure out what
the design of the capitalization module should be.

Merry Christmas,
Daniel

P.S. To anyone not interested in either of these developments: your
Christmas gift is that I accidentally made lexical selection quite a
bit faster while I was working on these.


_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
