Hi again,
I've found that the solution you suggested doesn't work properly. Some 
non-existent words are produced in the process and are kept throughout the 
filtering. This got worse when I tried to get adjectives: the list was full of 
strange words, as well as words of other categories, e.g. verbs.

I suspected that the expansion produces some output that pollutes the result, 
so I tried working directly on apertium-swe.swe.dix, like this:

grep "lm=" apertium-swe.swe.dix | grep "__n_" | less

This produced a usable list of nouns. A side effect is that this is far faster.
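
A minimal sketch of taking that grep one step further, so that only the lemma (the value of the lm attribute) is printed instead of the whole entry. The miniature dictionary below is invented so the block can run on its own; the entry layout and the paradigm names are assumptions, not copied from apertium-swe.swe.dix.

```shell
# Hypothetical miniature dictionary; entries and paradigm names are made up.
dix=$(mktemp)
cat > "$dix" <<'EOF'
<e lm="hund"><i>hund</i><par n="hund__n_ut"/></e>
<e lm="abonnera"><i>abonner</i><par n="abonner/a__vblex"/></e>
<e lm="hus"><i>hus</i><par n="hus__n_nt"/></e>
EOF

# Same two greps as above, then keep only the value of lm="..." and deduplicate.
nouns=$(grep 'lm=' "$dix" | grep '__n_' |
  sed -E 's/.*lm="([^"]*)".*/\1/' | sort -u)

printf '%s\n' "$nouns"
rm -f "$dix"
```

On the real file, the temporary file would be replaced by apertium-swe.swe.dix itself. This assumes each entry sits on a single line together with its paradigm reference.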

Remember, I asked about some very strange Swedish "nouns":
arna
arnas
arnas-
ars
ars-
So far I have not been able to find out where they come from. They are not 
listed as nouns in apertium-swe.swe.dix.
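
One possible way to trace such forms is to keep the whole expanded line instead of only the lemma, so that the surface form and the tags show where each one comes from. The sketch below assumes lt-expand lines of the form surface:lemma<tags>; the sample lines are invented to make it runnable and say nothing about what the real dictionary contains.

```shell
# Hypothetical stand-in for `lt-expand apertium-swe.swe.dix`; real output will differ.
expanded=$(mktemp)
cat > "$expanded" <<'EOF'
hundarna:hund<n><ut><pl><def>
arna:arna<n><ut><pl><def>
ars-:ars-<n><ut><sg><ind>
hus:hus<n><nt><sg><ind>
EOF

# Keep whole lines whose lemma (the text between the colon and the first tag)
# is one of the suspicious strings.
suspects=$(grep -E '^[^<:>]+:(arna|arnas|arnas-|ars|ars-)<' "$expanded")

printf '%s\n' "$suspects"
rm -f "$expanded"
```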

Among the adjectives I got e.g. the following verbs:
abbreviera
abdikera
abonnera
abortera

I used:
lt-expand apertium-swe.swe.dix | grep -E "[^<:>]+:[^<:>]+<adj>" |
  sed -E 's/[^<:>]+:([^<:>]+).*/\1/g' | sed 's/[¹²³]//g'
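
A hedged variant of that pipeline for diagnosis: it anchors the pattern at the start of the line and prints the lemma, the surface form, and the tags side by side, which makes it easier to see why a given lemma ends up in the adjective list. The sample lines and their tags are invented for illustration and are not taken from the real lt-expand output.

```shell
# Hypothetical stand-in for `lt-expand apertium-swe.swe.dix`; real output will differ.
expanded=$(mktemp)
cat > "$expanded" <<'EOF'
gul:gul<adj><pst><ut><sg><ind>
gult:gul<adj><pst><nt><sg><ind>
abonnerad:abonnera<adj><pp><ut><sg><ind>
abonnerar:abonnera<vblex><pres>
EOF

# Select lines whose first tag is <adj>, then print "lemma surface tags".
report=$(grep -E '^[^<:>]+:[^<:>]+<adj>' "$expanded" |
  sed -E 's/^([^<:>]+):([^<:>]+)(<.*)$/\2 \1 \3/')

printf '%s\n' "$report"
rm -f "$expanded"
```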

Does anyone have a clue?

Yours,
Per Tunedal

On Tue, Apr 28, 2020, at 18:36, Samuel Sloniker wrote:
> egrep and fgrep are deprecated. Use grep -E and grep -F instead.
> 
> On Tue, Apr 28, 2020 at 7:56 AM Per Tunedal <per.tune...@operamail.com> wrote:
>> Hi,
>>  thank you all for your kind help. I'm getting the lists I need.
>>  Yours
>>  Per Tunedal
>> 
>>  On Mon, Apr 27, 2020, at 20:35, Bernard Chardonneau wrote:
>>  > Yes, I would rather do that than use
>>  > 
>>  > (<vblex>|<vbmod>|<vbser>|<vbhaver>)
>>  > 
>>  > and I also use fgrep and egrep instead of grep -F and grep -E,
>>  > as it was (is?) in UNIX.
>>  > 
>>  > 
>>  > > Date: Sun, 26 Apr 2020 10:40:39 -0700
>>  > > From: Samuel Sloniker <scoopgra...@gmail.com>
>>  > > To: apertium-stuff@lists.sourceforge.net
>>  > > Reply-To: apertium-stuff@lists.sourceforge.net
>>  > > Subject: Re: [Apertium-stuff] List of verbs
>>  > >
>>  > > Shouldn't <vb(lex|mod|ser|haver)> also work?
>>  > >
>>  > > On Fri, Apr 24, 2020 at 7:25 AM Daniel Swanson <awesomeevildu...@gmail.com>
>>  > > wrote:
>>  > >
>>  > > > Also, to explain the patterns
>>  > > >
>>  > > > [^<:>]+ is "match any string of characters that doesn't contain a tag
>>  > > > or a colon"
>>  > > >
>>  > > > So the grep is "anything without tags or colons (i.e. a surface form)
>>  > > > then a colon then another string (a lemma) then a <n> tag"
>>  > > >
>>  > > > The sed matches roughly the same thing except it has () around the
>>  > > > lemma so it can refer to it later and .* to match whatever tags there
>>  > > > may be. \1 then replaces the line with the contents of the first (),
>>  > > > i.e. the lemma.
>>  > > >
>>  > 
>>  > --------------------------------
>>  > Bernard Chardonneau (France)
>>  > Phone : [33] 9 72 36 32 90
>>  > GSM phone : [33] 7 69 46 16 31
>>  > 
>>  > An alternative Apertium translation website :
>>  > http://apertiumtrad.tuxfamily.org
>>  > 
>>  > Multilingual websites for my free softwares :
>>  > http://libremail.free.fr and http://libremail.tuxfamily.org
>>  > http://cyloop.tuxfamily.org (mainly translated with Apertium)
>>  > 
>>  > My general website (in french only)
>>  > http://bech.free.fr
>>  > 
>>  > 
>>  > _______________________________________________
>>  > Apertium-stuff mailing list
>>  > Apertium-stuff@lists.sourceforge.net
>>  > https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>>  >
