A (difficult) example of FLAG long is the Dutch dictionary by OpenTaal (largely my work):
http://www.opentaal.org/bestanden/doc_download/19-woordenlijst-v-210g-voor-mozilla-producten

Ruud

> In the AFFIX file, the default flag is just 1 char.
> When the clause FLAG num is in the file, the flags are numbers in the
> 2-byte range, from 1 up to 65535, separated by commas (1,2,3,4,555).
>
> When the clause FLAG long is in the file, the flags are two chars long,
> which also translates into 2 bytes internally (only plain ASCII chars
> allowed), but there is no separator.
>
> Flags might look like
> Word/AaBbCcDD, which actually holds the flags Aa, Bb, etc.
>
> So I guess it might be best to use a double byte internally, translating
> all flags to that.
>
> Then there are multiple compounding methods.
>
> One uses COMPOUNDRULE (mostly for numbers etc., very systematic patterns):
>
> COMPOUNDRULE 39
> COMPOUNDRULE (N1)(n2) # eenen+zestig[ste]
> COMPOUNDRULE (G1)*(Le) # 1e - 9999e
> ...
>
> # general compounding, for normal compounding mechanisms. Beware: prefixes
> only apply to 'first' and suffixes to 'last'
> COMPOUNDBEGIN Ca
> COMPOUNDMIDDLE Cb
> COMPOUNDEND Cc
> COMPOUNDPERMITFLAG Cp
> ONLYINCOMPOUND Cx
>
>
> The other mechanism uses continuation flags:
> SFX CA Y 2
> SFX CA 0 /CaCp
> SFX CA 0 -/CaCp
>
> (Flag CA creates the option to add words with flag Ca, either with a - in
> front of it, or not)
>
> You could reduce some of the complexity by ignoring all 'filters', like
> 'checkcompoundpattern', 'onlyincompound' etc., because after generating the
> huge output, you could run Hunspell with the same dic and affix files and
> the command-line option -G to get the correct words from the entire list.
>
> The hunspell used had better be > 1.3 then, since 1.2* has a bug,
> suggesting mistaken compound words.
>
> Does this help?
>
> Ruud
>
>
>> Hello,
>>
>> I didn't go to work today because I was not feeling well.
>>
>> I have decided to dedicate some time today to improve PTG in order for
>> it also to unmunch .DICs with numbers instead of characters.
>>
>> Daniel and Ruud, could you explain to me in detail how to detect if the
>> .AFF deals with chrs or numbers?
>>
>> Also, could you provide the troubling dictionary for me to analyse and
>> test?
>>
>> I have been swamped with work and only in January I will be on vacation,
>> but I will try to do my best!
>>
>> Thanks!
>>
>> Kind regards from your friend,
>> Marco A.G.Pinto
>> ----------------------
>>
>>
>> --
>> ------------------------------------------------------------------------------
>> Comprehensive Server Monitoring with Site24x7.
>> Monitor 10 servers for $9/Month.
>> Get alerted through email, SMS, voice calls or mobile push
>> notifications.
>> Take corrective actions from your mobile device.
>> http://pubads.g.doubleclick.net/gampad/clk?id=154624111&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Languagetool-devel mailing list
>> Languagetool-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
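
To make the flag encodings described in this thread concrete, here is a minimal Python sketch of how a tool might detect the flag mode of an .aff file and split the flag field of a .dic entry. It assumes the mode is declared by a line starting with FLAG (no such line means single-character flags), and that a .dic entry is simply word/flags with no escaped slashes or extra fields. The names detect_flag_mode, split_flags and parse_dic_entry are hypothetical; they are not part of PTG or Hunspell.

```python
def detect_flag_mode(aff_text):
    """Return 'num', 'long', or 'char' (the default) for the text of an .aff file."""
    for line in aff_text.splitlines():
        parts = line.split()
        # Assumption: the mode is declared as "FLAG num" or "FLAG long".
        if len(parts) >= 2 and parts[0] == "FLAG" and parts[1] in ("num", "long"):
            return parts[1]
    # No (recognised) FLAG clause: treat every character as one flag.
    return "char"


def split_flags(flag_field, mode):
    """Split the flag part of a .dic entry according to the flag mode."""
    if mode == "num":
        # Comma-separated numbers, e.g. "1,2,3,4,555".
        flags = [int(f) for f in flag_field.split(",") if f]
        for f in flags:
            if not 1 <= f <= 65535:
                raise ValueError(f"numeric flag out of range: {f}")
        return flags
    if mode == "long":
        # Two characters per flag, no separator, e.g. "AaBbCcDD".
        if len(flag_field) % 2:
            raise ValueError(f"odd-length flag field in long mode: {flag_field!r}")
        return [flag_field[i:i + 2] for i in range(0, len(flag_field), 2)]
    # Default mode: one character per flag.
    return list(flag_field)


def parse_dic_entry(entry, mode):
    """Return (word, flags) for a simple 'word/flags' entry (hypothetical helper)."""
    word, sep, flag_field = entry.strip().partition("/")
    flags = split_flags(flag_field, mode) if sep else []
    return word, flags


if __name__ == "__main__":
    mode = detect_flag_mode("SET UTF-8\nFLAG long\n")
    print(mode, parse_dic_entry("Word/AaBbCcDD", mode))
```

Keeping the flags as a list of two-character strings or integers, rather than packing them into a 2-byte value, is only a choice made for readability here; a real tool might prefer the packed form suggested in the reply.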