Marcin,

Is there a readable document that shows how prefixes and suffixes are defined and applied in the morfologik speller?

Ruud

> Hi all,
>
> I want to introduce compounding support for MorfologikSpeller for the
> next LT release. I looked at the hunspell dictionaries again, and it seems
> that hunspell categorizes words into prefixes, infixes, and suffixes.
> Additionally, it has flags to designate words that are allowed only in
> compounds, as well as a flag to designate an affix allowed anywhere
> (COMPOUNDPERMITFLAG; I have no idea what it means in practice). There is
> also a flag to prohibit certain compounds as incorrect.
>
> There are also problems with lower and upper case (the KEEPCASE flag as
> well as CHECKCOMPOUNDCASE).
>
> I don't exactly understand the parameter COMPOUNDMIN.
>
> Here's the old idea for us:
>
> http://wiki.languagetool.org/compounding-support-in-morfologikspeller
>
> Basically, I think it will be quite easy to parse the hunspell
> dictionary to get all the words with compounding flags in all their
> forms, so we would be able to convert hunspell dictionaries to FSA
> dictionaries with structured tags. But to do so, I need to understand
> the semantics of the flags in hunspell dictionaries, and the hunspell
> documentation is scarce at best. Could anyone please explain it better
> to me? Ruud?
>
> Alternatively, we could leave the speller dictionaries as they are and add
> the support for compounding directly in LanguageTool by using JWordSplitter
> to split words, but I'm afraid this won't work as nicely, as we don't have
> such a library for Dutch, for example.
>
> Regards,
> Marcin

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
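
To make the "parse the hunspell dictionary to get all the words with compounding flags" step from the quoted message a bit more concrete, here is a minimal Python sketch. It is only an illustration under several assumptions, not how LanguageTool or Morfologik do it: the function names `parse_aff` and `words_by_role` are hypothetical, the sample data is made up, flags are assumed to be single characters (the default hunspell flag mode), and affix expansion ("in all their forms") is not attempted. COMPOUNDMIN is read as a plain integer; as far as I understand the hunspell man page, it is the minimum length of a word that may take part in a compound, but please verify that reading.

```python
# Directive names as they appear in hunspell .aff files; each one declares
# which flag marks a given compounding role.
COMPOUND_DIRECTIVES = {
    "COMPOUNDFLAG", "COMPOUNDBEGIN", "COMPOUNDMIDDLE", "COMPOUNDEND",
    "ONLYINCOMPOUND", "COMPOUNDPERMITFLAG", "COMPOUNDFORBIDFLAG",
}


def parse_aff(aff_text):
    """Return ({flag: directive_name}, compound_min) for the text of an .aff file."""
    flag_roles = {}
    compound_min = None
    for line in aff_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank lines and one-word lines carry no directive value
        name, value = parts[0], parts[1]
        if name in COMPOUND_DIRECTIVES:
            flag_roles[value] = name
        elif name == "COMPOUNDMIN":
            compound_min = int(value)
    return flag_roles, compound_min


def words_by_role(dic_text, flag_roles):
    """Group .dic entries by the compounding directive their flags refer to."""
    roles = {}
    # The first line of a .dic file is the (approximate) entry count; skip it.
    for line in dic_text.splitlines()[1:]:
        fields = line.split()
        if not fields:
            continue
        # Assumption: "word/FLAGS" with one character per flag, no escaped slashes.
        word, _, flags = fields[0].partition("/")
        for flag in flags:
            role = flag_roles.get(flag)
            if role is not None:
                roles.setdefault(role, []).append(word)
    return roles


# Made-up sample data, for illustration only.
SAMPLE_AFF = """COMPOUNDMIN 3
COMPOUNDBEGIN x
COMPOUNDMIDDLE y
COMPOUNDEND z
ONLYINCOMPOUND o
"""
SAMPLE_DIC = """4
fiets/xz
band/z
s/yo
huis
"""

flag_roles, compound_min = parse_aff(SAMPLE_AFF)
roles = words_by_role(SAMPLE_DIC, flag_roles)
for role, words in sorted(roles.items()):
    print(role, words)
```

The resulting role-to-words mapping is roughly the information that structured tags in an FSA dictionary would need to carry; how those tags would be encoded is left open here.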