About question #3: I figured it out, in case anyone has any use for it.

The star (*) is regex for "zero or more of the preceding item", so 9* matches <blank>, 9, 99, 999, ..., 99999, etc. In a COMPOUNDRULE the item is a flag, so n* means any number of words flagged /n.

The word list is hunspell/tests/compoundrule4.dic, which contains the cardinal numbers (0-9) and ordinal numbers (0th, 1st, 1th, ..., 8th, 9th). Each entry has one or more flags according to its type and suffix. Ordinals ending in "th" are flagged /t; the other flags are /n, /p, /m and /c.
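For anyone who wants to poke at this, here is a minimal Python sketch. It assumes the two patterns given in the comments of the quoted en_US.aff excerpt behave like ordinary Python regexes. Hunspell itself applies COMPOUNDRULE to the flags of dictionary words rather than to characters, so this only mimics the intent of the rules, not how Hunspell evaluates them. The names matches_star and is_ordinal are made up for the example.

```python
import re

# The star means "zero or more of the preceding item".
STAR = re.compile(r"9*")

# Patterns copied from the comments in the quoted en_US.aff excerpt.
# Assumption: they are treated here as plain Python regexes.
RULE_1 = re.compile(r"[0-9]*1[0-9]th")
RULE_2 = re.compile(r"[0-9]*[02-9](1st|2nd|3rd|[4-9]th)")


def matches_star(text):
    """True if the whole string consists of zero or more 9s."""
    return STAR.fullmatch(text) is not None


def is_ordinal(text):
    """True if the whole string matches either commented pattern."""
    return any(p.fullmatch(text) for p in (RULE_1, RULE_2))


if __name__ == "__main__":
    for s in ["", "9", "999", "98"]:
        print(repr(s), matches_star(s))
    for s in ["10th", "56714th", "21st", "123rd", "11st", "21th"]:
        print(s, is_ordinal(s))
```

Rule 1 is what keeps "11st" and "112nd" out: whenever the second-to-last digit is 1, only the "th" forms are allowed.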
Rule 1 (n*1t): any combination of the digits 0-9 (n*) + 1 + an ordinal flagged /t, i.e. one carrying the "th" suffix used for numbers ending in 1X (10th-19th, 2012th).

Rule 2 (n*mp): any combination of the digits 0-9 (n*) + a digit flagged /m (0 or 2-9, matching the [02-9] in the comment quoted below) + an ordinal flagged /p (5th, 8th etc). This gives the suffixes for 1-9 and for 21 and up, except the numbers that rule #1 covers.

Good luck!

2012/5/10 Sahand.T <[email protected]>

> Thank you Yakov.
>
> 1. About the TRY attribute. It says that the letters should be in order of
> most used characters to least used. This means that "TRY aeis" tries to
> replace the character "a" in the words first, and then "e", "i" etc. So if
> a person types 'willang', it will replace "a" before "i" and possibly find
> "willing", correct?
>
> Do you happen to know the answers to question #2 and #3 too?
>
> /Sahand
>
>
> 2012/5/10 Yakov Reztsov <[email protected]>
>
>> > Hello, I just joined this mailing list for the purpose of understanding
>> > Hunspell better. I am trying to create a spell checker for central
>> > kurdish/sorani and am currently looking through examples and playing
>> > with the .aff file.
>> >
>> > I don't really know how mailing lists work, but if anyone has answers to
>> > these things I'd appreciate it (follow up questions may arise).
>> >
>> > 1. What does the TRY attribute actually do? I found the manuals cryptic
>> > in their explanation. I understand that it is used to determine wrong
>> > characters in words, but I don't get how it does it or how I should set
>> > it up for my needs.
>> >
>> The TRY attribute is used to generate suggestions. It is not applied to
>> correct words. The dictionary will work without this attribute.
>>
>> > 2. Taken from manual4:
>> >
>> > > "Personal dictionaries are simple word lists. Asterisk at the first
>> > > character position signs prohibition. A second word separated by a
>> > > slash sets the affixation.
>> > >
>> > > foo
>> > > Foo/Simpson
>> > > *bar
>> > >
>> > > In this example, "foo" and "Foo" are personal words, plus Foo will be
>> > > recognized with affixes of Simpson (Foo’s etc.) and bar is a forbidden
>> > > word."
>> >
>> > What does the "affixes of Simpson" mean? Is Simpson a flag/class in the
>> > .aff file or what? Or does it mean "FooSimpson" will be allowed?
>> >
>> > 3. What does this compoundrule from an en_US.aff mean and how does it
>> > make the rules for adding "st", "th", "nd", "rd" to numbers properly?
>> >
>> > > # ordinal numbers
>> > > COMPOUNDMIN 1
>> > > # only in compounds: 1th, 2th, 3th
>> > > ONLYINCOMPOUND c
>> > > # compound rules:
>> > > # 1. [0-9]*1[0-9]th (10th, 11th, 12th, 56714th, etc.)
>> > > # 2. [0-9]*[02-9](1st|2nd|3rd|[4-9]th) (21st, 22nd, 123rd, 1234th, etc.)
>> > > COMPOUNDRULE 2
>> > > COMPOUNDRULE n*1t
>> > > COMPOUNDRULE n*mp
>> > > WORDCHARS 0123456789
>> >
>> > 4. When I've created all the rules and a dictionary, do I then use
>> > Hunspell to generate better .dic/.aff files? If so, how are they better?
>> > (words with prefixes are removed?)
>> >
>> No. The task is completed.
>> But you can make the affix and dict files from a list of words with a script:
>>
>> http://hunspell.cvs.sourceforge.net/viewvc/hunspell/hunspell/src/tools/affixcompress?revision=1.1.1.1
>>
>> Files generated by hand will work better.
>>
>> > What else do you need the hunspell source and executables for? Is it for
>> > the testing features or is there something I've missed that is awesome
>> > about having the Hunspell source?
>> >
>> For testing only.
>>
>> --
>> Yakov
