Re: [lingu-dev] Hunspell morphological analysis and grammar checker

Marcin Miłkowski Sun, 25 May 2008 12:38:42 -0700

Michel Weimerskirch wrote:

> Thank you Marcin and Janis for your comments.
>
> I have been rebuilding the wordlist for Luxembourgish from scratch for
> the last few months. It will be released in a few weeks. Most of the
> words are arranged in separate lists like "adjectives", "nouns", etc.
> and the affix rules have been created accordingly.
>
> Thus, adding PoS information *should* be straightforward and it would
> save the effort of managing PoS data in another location. I haven't
> tried it yet though, because I'm still trying to figure out the best
> way to do this.

If you have a good affix file, then give it a try. Note also that Polish is quite an extreme case: it is a Slavonic language, so we have lots of conjugations, declensions and exceptions. Nobody really knows how many, and schoolbooks lie, saying it's 5 or something like that :D

> I could try to write a prototype that converts the morphological data
> generated by hunspell into a source file for this morph_fsa tool (have
> to look into this one too). Maybe something like this:
> - "unmunch" my wordlist to get a list of all "possible" words
> - generate morphological data using the "analyze" tool
> - apply a little awk/sed magic ;-)

This looks like a good idea. All you need for morph_fsa is a tabbed file with three tab-separated fields: a base form, an inflected form, and a POS tag.
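
As a rough illustration of that last conversion step, here is a minimal Python sketch. It assumes each analysis is a whitespace-separated string of Hunspell-style morphological fields in which `st:` carries the stem and `po:` the part of speech. The helper names `parse_fields` and `to_morph_row` and the sample word are hypothetical, and the exact output of the "analyze" tool may need extra parsing before it reaches this stage.

```python
def parse_fields(analysis):
    """Split an analysis such as 'st:Haus po:noun' into a dict of fields."""
    fields = {}
    for token in analysis.split():
        key, sep, value = token.partition(":")
        # Keep only 'key:value' tokens; the first occurrence of a key wins.
        if sep and key not in fields:
            fields[key] = value
    return fields


def to_morph_row(inflected, analysis):
    """Build one 'base<TAB>inflected<TAB>pos' line, or None if data is missing.

    The field order follows the description above (base form, inflected
    form, POS tag); check it against what morph_fsa actually expects.
    """
    fields = parse_fields(analysis)
    base = fields.get("st")  # assumed: 'st:' holds the stem / base form
    pos = fields.get("po")   # assumed: 'po:' holds the part of speech
    if not base or not pos:
        return None
    return "\t".join((base, inflected, pos))


if __name__ == "__main__":
    # Illustrative data only, not taken from a real dictionary run.
    print(to_morph_row("Haiser", "st:Haus po:noun is:plural"))
```

Words whose analysis lacks a stem or a PoS field are skipped (the function returns None), so they can be collected and reviewed separately rather than ending up as incomplete rows.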

If you want to make a grammar checker for Luxembourgish using LanguageTool as your framework, I can help you with the first steps for doing this.


Regards,
Marcin
