Hi Kevin,

Thanks for the explanation. But what's the point of expanding the dictionary, anyway?

I successfully tried:

    grep "lm=" apertium-swe.swe.dix | grep "__n_"
    grep "lm=" apertium-swe.swe.dix | grep "__vblex"
    grep "lm=" apertium-swe.swe.dix | grep "__adj"

etc. Faster and easier.

But I didn't get the two nouns "tur", due to the comment in Norwegian. I had to add "tur" manually. I used:

    grep "lm=" apertium-swe.swe.dix | grep "__adj" | sed 's/\"><.*//' | sed 's/<e lm=\"//'

This matched most lines, as they looked like this:

    <e lm="vals"><i>vals</i><par n="fax__n_ut"/></e>

But it didn't match these two lines:

    <e c="flaks" lm="tur"> <p><l>tur</l><r>tur¹</r></p><par n="mjölk__n_ut"/></e>
    <e c="gåtur" lm="tur"> <p><l>tur</l><r>tur²</r></p><par n="film__n_ut"/></e>

I didn't care, as it was just two lines that had comments.

Yours,
Per Tunedal

On Mon, May 11, 2020, at 10:18, Kevin Brubeck Unhammer wrote:
> "Per Tunedal" <per.tune...@operamail.com> wrote:
>
> [...]
>
> > arna
> > arnas
> > arnas-
> > ars
> > ars-
> > I have so far not been able to find out where they come from. They are not
> > listed as nouns in apertium-swe.swe.dix
>
> Probably the sed not being able to handle lines like
>
> DJ:arna:DJ<n><ut><pl><def>
>
> You may have to grep out lines with two colons first.
>
> > Among the adjectives I got e.g. the following verbs:
> > abbreviera
> > abdikera
> > abonnera
> > abortera
>
> Participles of verbs get tagged <adj><pp> / <adj><pprs>. You can grep
> them out.
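
For reference, the two "tur" entries were skipped because their c="..." attribute comes before lm="...", so the pattern '<e lm="' never matches them. A minimal sketch of a variant that does not depend on attribute order is below. It assumes a grep that supports -o (GNU and BSD grep both do) and at most one lm="..." per line. The temporary sample file is a stand-in for apertium-swe.swe.dix, and the verb entry with its paradigm name "example__vblex" is invented purely to show that the noun filter still works.

```shell
# Hypothetical sample file standing in for apertium-swe.swe.dix.
dix=$(mktemp)
cat > "$dix" <<'EOF'
<e lm="vals"><i>vals</i><par n="fax__n_ut"/></e>
<e c="flaks" lm="tur"> <p><l>tur</l><r>tur¹</r></p><par n="mjölk__n_ut"/></e>
<e c="gåtur" lm="tur"> <p><l>tur</l><r>tur²</r></p><par n="film__n_ut"/></e>
<e lm="abonnera"><i>abonner</i><par n="example__vblex"/></e>
EOF

# Keep the noun entries, cut each line down to its lm="..." attribute
# (wherever it sits inside the <e ...> tag), then take the quoted value.
nouns=$(grep 'lm=' "$dix" | grep '__n_' | grep -o 'lm="[^"]*"' | cut -d '"' -f 2)

printf '%s\n' "$nouns"
rm -f "$dix"
```

The same last two stages can replace the two sed calls in the original pipeline. Because "tur" has two entries it is listed twice, so piping the result through sort -u may be useful.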
_______________________________________________ Apertium-stuff mailing list Apertium-stuff@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/apertium-stuff