Re: [lingu-dev] How to get list of valid word in hunspell

Németh László Tue, 03 Mar 2009 08:59:31 -0800

Hi,

1. I have made a shell script called unmunch.sh. It supports several
Hunspell features: Unicode encoding, different flag types and double
suffixes (so it can process the output of the doubleaffixcompress
script):


http://downloads.sourceforge.net/hunspell/unmunch.sh

(I have also updated the doubleaffixcompress script:
http://downloads.sourceforge.net/hunspell/doubleaffixcompress)

Unfortunately, unmunch.sh does not support compound words or special options.
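
To illustrate what "expanding" a dictionary means, here is a toy Python
sketch. It is not how unmunch.sh is implemented: the function name
expand_suffixes and the sample rules are made up for this example, and it
assumes single-character flags, single suffixes only, and conditions that
can be treated as a regular expression anchored at the end of the stem.

```python
import re

def expand_suffixes(dic_entries, sfx_rules):
    """Return the stems plus every form produced by their suffix flags.

    dic_entries: entries such as "work/SD" (stem, optional slash, flags)
    sfx_rules:   flag -> list of (strip, add, condition) tuples, mirroring
                 the fields of a Hunspell SFX rule line
    """
    words = []
    for entry in dic_entries:
        stem, _, flags = entry.partition("/")
        words.append(stem)
        for flag in flags:  # assumption: one character per flag
            for strip, add, cond in sfx_rules.get(flag, []):
                # "0" means "strip nothing"; otherwise the stem must end with it
                if strip != "0" and not stem.endswith(strip):
                    continue
                # "." means "no condition"; otherwise match at the end of the stem
                if cond != "." and not re.search(cond + "$", stem):
                    continue
                base = stem if strip == "0" else stem[: len(stem) - len(strip)]
                words.append(base + ("" if add == "0" else add))
    return words

# Hypothetical rules, written as (strip, add, condition)
rules = {
    "S": [("0", "s", "[^y]"), ("y", "ies", "y")],
    "D": [("0", "ed", "[^e]"), ("0", "d", "e")],
}

expanded = expand_suffixes(["work/SD", "try/S", "bake/D", "hello"], rules)
print("\n".join(expanded))
```

Prefixes, double suffixes, long or numeric flag types and compounding are
all left out here, so treat it only as a picture of the basic idea.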

2. The ICONV feature is for general input conversion, so you can use it
for normalization:

ICONV 2
ICONV ọ́ ọ́
ICONV ọ́ ọ́

(Check the correct encoding with GNU recode:
$ cat your_aff | recode u8..h4
ICONV 2
ICONV o&#803;&#769; &oacute;&#803;
ICONV &#7885;&#769; &oacute;&#803;)
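
If GNU recode is not at hand, the code points can also be checked with a
few lines of Python. This is only a sketch using the standard unicodedata
module; the names spellings, codepoints and nfc_forms are made up for the
example. Note that Unicode NFC normalization gives ọ + acute (U+1ECD
U+0301), which is not the same target as ó + dot below used in the ICONV
lines above. Either target should do, provided the words in the dictionary
are stored in the same form that ICONV converts to.

```python
import unicodedata

# Three ways of typing the same visible character
spellings = [
    "o\u0323\u0301",  # o + combining dot below + combining acute
    "\u1ecd\u0301",   # precomposed o-with-dot-below + combining acute
    "\u00f3\u0323",   # precomposed o-with-acute + combining dot below
]

def codepoints(text):
    """Format a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

# Canonical composition maps all three spellings to one sequence
nfc_forms = [unicodedata.normalize("NFC", s) for s in spellings]

for raw, nfc in zip(spellings, nfc_forms):
    print(codepoints(raw), "->", codepoints(nfc))
```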

Regards,
László


2009/3/2 Sunday Bolaji <[email protected]>:
> Hi,
>     Is there any way or command to get a list of all valid words in
> the Hunspell library, both the ones in the dictionary file and the ones
> generated by the affix rules?
>   Secondly, is there any way to let Hunspell know that the same combined
> character written in different ways is the same? For example, the
> character " ọ́ " can be written by first writing " o " and adding the
> under dot and tone mark, or by first writing " ọ " and adding the tone
> mark, or by first writing " ó " and adding the under dot to it.
>
> Regards,
> Jeje
