Hi,

1. I have written a shell script called unmunch.sh. It supports several Hunspell features: Unicode encoding, different flag types, and double suffixes (so it can process the output of the doubleaffixcompress script):
http://downloads.sourceforge.net/hunspell/unmunch.sh

(I have also updated the doubleaffixcompress script:
http://downloads.sourceforge.net/hunspell/doubleaffixcompress)

Unfortunately, unmunch.sh does not support compound words or special options.

2. The ICONV feature is a general input conversion table, so you can use it for normalization:

ICONV 2
ICONV ọ́ ọ́
ICONV ọ́ ọ́

(Check the correct encoding with GNU recode:

$ cat your_aff | recode u8..h4
ICONV 2
ICONV ọ́ ọ́
ICONV ọ́ ọ́)

Regards,
László

2009/3/2 Sunday Bolaji <[email protected]>:
> Hi,
> Please, is there any way or command that can be used to get a list of all
> valid words in the Hunspell library, both the ones in the dictionary file
> and the ones generated using affix rules?
> Secondly, is there any way to let Hunspell know that the same combined
> character written in different ways is the same? An example is the
> character " ọ́ ", which can be written by first writing " o " and adding an
> under dot and a tone mark, or by first writing " ọ " and adding a tone mark,
> or by first writing " ó " and adding an under dot to it.
>
> Regards,
> Jeje
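
To illustrate the normalization point (2) above, here is a minimal Python sketch. It is not part of Hunspell or unmunch.sh. It takes the three spellings of "ọ́" described in the question, picks one target form, and prints ICONV lines that map the other spellings to it. Using Unicode NFC as the target is an assumption made for this example; the dictionary words would have to be stored in that same form. The names VARIANTS, iconv_lines and show are hypothetical helpers.

```python
import unicodedata

# The three spellings of the character described in the question,
# written as explicit code point sequences.
VARIANTS = [
    "o\u0323\u0301",  # o + combining dot below + combining acute
    "\u1ecd\u0301",   # o with dot below (precomposed) + combining acute
    "\u00f3\u0323",   # o with acute (precomposed) + combining dot below
]


def iconv_lines(variants, form="NFC"):
    """Build ICONV lines mapping each non-target spelling to the target form.

    Assumption: the dictionary stores words in the given normalization
    form (NFC by default), so spellings already in that form need no rule.
    """
    rules = []
    for v in variants:
        target = unicodedata.normalize(form, v)
        if v != target and (v, target) not in rules:
            rules.append((v, target))
    # The first line carries the number of rules that follow.
    return ["ICONV %d" % len(rules)] + ["ICONV %s %s" % r for r in rules]


def show(s):
    """Render a string as a list of code points, e.g. 'U+006F U+0323'."""
    return " ".join("U+%04X" % ord(c) for c in s)


if __name__ == "__main__":
    for line in iconv_lines(VARIANTS):
        print(line)
    # The printed rules look alike on screen, so also list the code points.
    for v in VARIANTS:
        print(show(v), "->", show(unicodedata.normalize("NFC", v)))
```

The code point listing serves the same purpose as the recode check above: the variants are indistinguishable when displayed, so the rules can only be verified by looking at the underlying code points.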
