Per Tunedal wrote:
> Hi again,
> Obviously, CG would be quite helpful for disambiguation when doing
> lemmatisation. Would it be complicated to add an option to use CG (if
> present)? Using the cg-rules for the language would probably remove some
> more ambiguity.

Exchange -tagger for -disam to run CG as well.

> Looking at the page http://wiki.apertium.org/wiki/Lemmatisation . What
> does the command actually do:
>
> $ echo "Den här är en test." | apertium -d . swe-tagger | cg-proc
> guesser.bin | sed 's/<[^>]\+>//g' | cg-proc -n guesser.bin
>
> Will give lemmatised output where the tokens are encased in ^ and
> $, and ambiguous stems/lemmas are given separated by '/'

Try it :)

-Kevin
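For anyone who cannot run the pipeline right away, here is a rough sketch of what the sed step in the quoted command does on its own. The sample line and its tags are made up for illustration and are not real output from apertium-swe; the substitution is the same as in the thread, only written with -E so it also works outside GNU sed.

```shell
# Hypothetical sample in Apertium stream format (surface/lemma<tags>);
# real tagger output for this sentence will differ.
sample='^Den/den<prn><dem>$ ^här/här<adv>$ ^test/test<n>$'

# Remove every <tag>, leaving only ^surface/lemma$ units.
# Equivalent to the sed 's/<[^>]\+>//g' step in the quoted pipeline.
strip_tags() {
  sed -E 's/<[^>]+>//g'
}

stripped=$(printf '%s\n' "$sample" | strip_tags)
printf '%s\n' "$stripped"

# To run CG as well, the reply above suggests swapping the mode name,
# presumably like this (assumes the language pair ships a swe-disam mode):
#   echo "Den här är en test." | apertium -d . swe-disam
```

When a token has several analyses, the alternatives stay separated by '/' inside the same ^...$ unit after the tags are stripped.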
