Malaga-fi is a Nutch plugin for indexing documents written in Finnish.
Malaga-fi analyses words morphologically, converts them to their base form (the form you find in dictionaries) and indexes the base forms, so that searching for the base form finds all inflections of a word. To use an English example, if you search for the word "give" you find all documents that contain "give", "gives", "gave", "given", or "giving". This is very important for Finnish, since a Finnish word can have tens of thousands of inflected forms.

What you need:

1. Malaga programming language.
   http://home.arcor.de/bjoern-beutel/malaga/

2. Suomimalaga - Description of Finnish morphology written in Malaga.
   http://sourceforge.net/project/showfiles.php?group_id=156731
   Newest version:
   svn co https://voikko.svn.sourceforge.net/svnroot/voikko/trunk/suomimalaga

3. JNA library - Simplified native library access for Java.
   https://jna.dev.java.net/

4. Malaga-fi - Nutch plugin for documents written in Finnish.
   http://sourceforge.net/projects/malaga-fi/

5. Nutch
   http://lucene.apache.org/nutch/

Malaga-fi is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
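
Example: the idea of base-form indexing

The following is a minimal Java sketch of the base-form indexing idea described above, using the English "give" example. It is not Malaga-fi's code and does not use the Malaga, JNA, or Nutch APIs. The class name BaseFormIndexSketch, its methods, and the small hand-written lexicon are all hypothetical and exist only for illustration. Malaga-fi obtains base forms through morphological analysis rather than a lookup table.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class BaseFormIndexSketch {
    // Hypothetical toy lexicon: inflected form -> base form.
    // A real analyser derives this from a morphology description.
    private static final Map<String, String> BASE_FORMS = Map.of(
        "gives", "give",
        "gave", "give",
        "given", "give",
        "giving", "give");

    // Return the base form of a word, or the lowercased word if unknown.
    static String baseForm(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        return BASE_FORMS.getOrDefault(w, w);
    }

    // Build an inverted index: base form -> ids of documents that
    // contain any inflection of that word.
    static Map<String, Set<Integer>> buildIndex(List<String> docs) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String token : docs.get(id).split("\\W+")) {
                if (token.isEmpty()) {
                    continue;
                }
                index.computeIfAbsent(baseForm(token), k -> new TreeSet<>()).add(id);
            }
        }
        return index;
    }

    // Look up a query word by its base form.
    static Set<Integer> search(Map<String, Set<Integer>> index, String query) {
        return index.getOrDefault(baseForm(query), Collections.emptySet());
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
            "She gives advice",
            "He gave up",
            "Nothing relevant here");
        Map<String, Set<Integer>> index = buildIndex(docs);
        System.out.println(search(index, "give")); // prints [0, 1]
    }
}
```

Because both the documents and the query are reduced to base forms before lookup, searching for "give" matches the documents containing "gives" and "gave" even though neither contains the literal query word.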