Re: [Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-23 Thread Per Tunedal
Hi,

Thank you Kevin! Works like a charm. BTW, I've already changed 'uniq' to 'sort -u'.

Yours,
Per

On Thu, Apr 23, 2020, at 10:42, Kevin Brubeck Unhammer wrote:
> "Per Tunedal" wrote:
>
> > Hi Kevin,
> > thanks for the explanation. Thus they are homonyms. How do I get rid of the
> >

Re: [Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-23 Thread Per Tunedal
Hi Kevin,

thanks for the explanation. Thus they are homonyms. How do I get rid of the duplicates? I just want:

tur

Yours,
Per Tunedal

On Thu, Apr 23, 2020, at 10:00, Kevin Brubeck Unhammer wrote:
> "Per Tunedal" wrote:
>
> > Hi Daniel,
> > Thank you! Works like a charm with a small

Re: [Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-23 Thread Per Tunedal
Hi Daniel,

Thank you! Works like a charm with a small exception. I get some strange duplicates, e.g. for tur:

tur¹
tur²

Yours,
Per Tunedal

On Wed, Apr 22, 2020, at 16:28, Daniel Swanson wrote:
> Hi Per,
>
> If I understand correctly, this might give what you want:
>
> lt-expand

Re: [Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-23 Thread Tanmai Khanna
Hi,

How about you try this:

lt-expand apertium-swe.swe.dix | grep -E "[^<:>]+:[^<:>]+" | sed -E 's/[^<:>]+:([^<:>]+).*/\1/g' | sed -E 's/\p{No}//g' | uniq

Just a small addition to Daniel's earlier command, to delete all superscripts before removing duplicates. Hopefully you don't need
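
To see what the grep and first sed stages of the pipeline above do, here is a minimal sketch that runs them on a few hand-written lines. It assumes lt-expand prints lines shaped like `surface:lemma<tags>`, which is what the regular expressions imply. The sample lines and the helper name `extract_lemmas` are made up for illustration and are not taken from apertium-swe.swe.dix.

```shell
# Hypothetical helper: reads lt-expand-style lines on stdin and prints
# the lemma, i.e. the text between the first colon and the first tag.
extract_lemmas() {
  # Keep only lines that contain "text:text" with no <, : or > in either part.
  grep -E '[^<:>]+:[^<:>]+' |
    # Replace the whole line with the captured lemma after the colon.
    sed -E 's/[^<:>]+:([^<:>]+).*/\1/'
}

# Made-up sample input (assumed format, not real dictionary output).
printf '%s\n' \
  'tur:tur¹<n><ut><sg><ind><nom>' \
  'turen:tur¹<n><ut><sg><def><nom>' \
  'tur:tur²<n><ut><sg><ind><nom>' |
  extract_lemmas
```

On this input the helper prints `tur¹` twice and `tur²` once, so the homonym markers are still attached at this point; stripping them and removing duplicates is a separate step.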

Re: [Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-23 Thread Kevin Brubeck Unhammer
"Per Tunedal" wrote:

> Hi Kevin,
> thanks for the explanation. Thus they are homonyms. How do I get rid of the
> duplicates?
> I just want:
>
> tur

Before the `| uniq`, stick in

| sed 's/[¹²³]//g'

(You may have to change `uniq` to `sort -u` in case things are not ordered already.)
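
The reason for the `sort -u` remark is that `uniq` only collapses duplicate lines that are adjacent, so an unsorted list can still contain repeats after the superscripts are gone. A minimal sketch of the two steps together follows. The helper name `dedupe_lemmas` and the sample lemmas are made up, and the three markers are removed with separate expressions on the assumption that this is less dependent on the locale than a bracket expression with multibyte characters.

```shell
# Hypothetical helper: strips the homonym markers and removes duplicates.
dedupe_lemmas() {
  # Delete each superscript as a literal string, then sort and de-duplicate.
  # LC_ALL=C gives a plain byte-order sort, not Swedish collation.
  sed -e 's/¹//g' -e 's/²//g' -e 's/³//g' | LC_ALL=C sort -u
}

# Made-up lemma list in which the two "tur" entries are not adjacent.
printf '%s\n' 'tur¹' 'bil' 'tur²' | dedupe_lemmas
# prints:
# bil
# tur
```

With plain `uniq` in place of `sort -u`, the same input would yield `tur`, `bil`, `tur`, because the two `tur` lines are separated by `bil`.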

Re: [Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-23 Thread Kevin Brubeck Unhammer
"Per Tunedal" wrote:

> Hi Daniel,
> Thank you! Works like a charm with a small exception.
>
> I get some strange duplicates like e.g. tur:
>
> tur¹
> tur²

slump (chance) vs färd (journey), they have different paradigms:

turtur¹
turtur²