https://issues.apache.org/jira/browse/SOLR-2237
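A note on the ClassFormatError quoted below: the reported magic value 1347093252 is 0x504B0304, the "PK\3\4" signature that starts a zip/jar archive, while a compiled class file starts with 0xCAFEBABE. That fits step 3 of the Nov 11 message, where `jar -cf` wrote an archive named StempelTokenFilterFactory.class instead of the source being compiled. Here is a minimal sketch for checking this; the class name `MagicCheck` and its helper methods are made up for illustration and are not part of Solr or Stempel:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class MagicCheck {
    // First four bytes of a compiled Java class file.
    public static final int CLASS_MAGIC = 0xCAFEBABE;
    // First four bytes of a zip/jar archive ("PK" followed by 0x03 0x04).
    public static final int ZIP_MAGIC = 0x504B0304;

    // Classify a 32-bit big-endian magic number.
    public static String describe(int magic) {
        if (magic == CLASS_MAGIC) return "class file";
        if (magic == ZIP_MAGIC) return "zip/jar archive";
        return "unknown";
    }

    // Read the first four bytes of a file as a big-endian int.
    public static int readMagic(String path) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        // The value from the stack trace in the thread.
        int reported = 1347093252;
        System.out.println(Integer.toHexString(reported) + " -> " + describe(reported));

        // Optionally inspect files given on the command line.
        for (String path : args) {
            System.out.println(path + " -> " + describe(readMagic(path)));
        }
    }
}
```

Run without arguments it prints `504b0304 -> zip/jar archive`. If that reading is right, compiling the factory source with javac (with the Solr and Lucene jars and the extracted Stempel classes on the classpath; the exact paths depend on the install) before repackaging the jar should give a file that starts with CAFEBABE.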
On Mon, Nov 15, 2010 at 5:04 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:
> I tried to reach the authors twice, but with no luck. I've seen some
> posts where people finally were able to launch it (without much pain).
> I don't know. If any pro would be so nice as to try to run the stempel on
> his/her machine and paste me a verbose step-by-step solution, I
> would really appreciate it.
>
> Cheers,
> Jakub Godawa.
>
> 2010/11/13 Lance Norskog <goks...@gmail.com>:
>> I don't know if the Stempel jar includes the Java source. At this point I
>> think you should ask the author of Stempel to make a Solr front-end for it.
>> It's very simple for him.
>>
>> Jakub Godawa wrote:
>>>
>>> Am I not doing it in point no. 4? I am compiling the whole folder
>>> that was extracted before, but now with that new class file.
>>>
>>> 2010/11/12 Lance Norskog <goks...@gmail.com>:
>>>>
>>>> I think you have to compile all of the stempel source including your
>>>> filter factory into one jar at the same time. Everybody does this; I
>>>> don't know how different Java versions make class file binaries.
>>>>
>>>> On Thu, Nov 11, 2010 at 3:06 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:
>>>>>
>>>>> Hi! Sorry for such a break, but I was moving house... anyway:
>>>>>
>>>>> 1. I took the
>>>>> ~/apache-solr/src/java/org/apache/solr/analysis/StandardFilterFactory.java
>>>>> file and modified it (named as StempelFilterFactory.java) in Vim that
>>>>> way:
>>>>>
>>>>> package org.getopt.solr.analysis;
>>>>>
>>>>> import org.apache.lucene.analysis.TokenStream;
>>>>> import org.apache.lucene.analysis.standard.StandardFilter;
>>>>>
>>>>> public class StempelTokenFilterFactory extends BaseTokenFilterFactory {
>>>>>   public StempelFilter create(TokenStream input) {
>>>>>     return new StempelFilter(input);
>>>>>   }
>>>>> }
>>>>>
>>>>> 2. Then I put the file into the extracted stempel-1.0.jar in
>>>>> ./org/getopt/solr/analysis/
>>>>> 3. Then I created a class from it:
>>>>> jar -cf StempelTokenFilterFactory.class StempelFilterFactory.java
>>>>> 4. Then I created a new stempel-1.0.jar archive:
>>>>> jar -cf stempel-1.0.jar -C ./stempel-1.0/ .
>>>>> 5. Then in schema.xml I've put:
>>>>>
>>>>> <fieldType name="text_pl" class="solr.TextField">
>>>>>   <analyzer>
>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>     <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory" />
>>>>>   </analyzer>
>>>>> </fieldType>
>>>>>
>>>>> 6. I started the Solr server and I received the following error:
>>>>>
>>>>> 2010-11-11 11:50:56 org.apache.solr.common.SolrException log
>>>>> SEVERE: java.lang.ClassFormatError: Incompatible magic value 1347093252 in class file org/getopt/solr/analysis/StempelTokenFilterFactory
>>>>>   at java.lang.ClassLoader.defineClass1(Native Method)
>>>>>   at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
>>>>>   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>>>   ...
>>>>>
>>>>> Question: What is wrong? :) I use "jar (fastjar) 0.98" to create jars.
>>>>> I googled that error, but no answer gave me an idea of what is wrong
>>>>> in my .java file.
>>>>>
>>>>> Please help, as I believe I am close to the end of that subject.
>>>>>
>>>>> Cheers,
>>>>> Jakub Godawa.
>>>>>
>>>>> 2010/11/3 Lance Norskog <goks...@gmail.com>:
>>>>>>
>>>>>> Here's the problem: Solr is a little dumb about these Filter classes,
>>>>>> and so you have to make a Factory object for the Stempel Filter.
>>>>>>
>>>>>> There are a lot of other FilterFactory classes. You would have to just
>>>>>> copy one and change the names to Stempel and it might actually work.
>>>>>>
>>>>>> This will take some Solr programming; perhaps the author can help you?
>>>>>>
>>>>>> On Tue, Nov 2, 2010 at 7:08 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:
>>>>>>>
>>>>>>> Sorry, I am not a Java programmer at all. I would appreciate more
>>>>>>> verbose (or step-by-step) help.
>>>>>>>
>>>>>>> 2010/11/2 Bernd Fehling <bernd.fehl...@uni-bielefeld.de>:
>>>>>>>>
>>>>>>>> So you call org.getopt.solr.analysis.StempelTokenFilterFactory.
>>>>>>>> In this case I would assume a file StempelTokenFilterFactory.class
>>>>>>>> in your directory org/getopt/solr/analysis/.
>>>>>>>>
>>>>>>>> And a class which extends BaseTokenFilterFactory, right?
>>>>>>>> ...
>>>>>>>> public class StempelTokenFilterFactory extends BaseTokenFilterFactory
>>>>>>>>     implements ResourceLoaderAware {
>>>>>>>> ...
>>>>>>>>
>>>>>>>> On 02.11.2010 14:20, Jakub Godawa wrote:
>>>>>>>>>
>>>>>>>>> This is what stempel-1.0.jar consists of after jar -xf:
>>>>>>>>>
>>>>>>>>> jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R org/
>>>>>>>>> org/:
>>>>>>>>> egothor getopt
>>>>>>>>>
>>>>>>>>> org/egothor:
>>>>>>>>> stemmer
>>>>>>>>>
>>>>>>>>> org/egothor/stemmer:
>>>>>>>>> Cell.class Diff.class Gener.class MultiTrie2.class
>>>>>>>>> Optimizer2.class Reduce.class Row.class TestAll.class
>>>>>>>>> TestLoad.class Trie$StrEnum.class
>>>>>>>>> Compile.class DiffIt.class Lift.class MultiTrie.class
>>>>>>>>> Optimizer.class Reduce$Remap.class Stock.class Test.class
>>>>>>>>> Trie.class
>>>>>>>>>
>>>>>>>>> org/getopt:
>>>>>>>>> stempel
>>>>>>>>>
>>>>>>>>> org/getopt/stempel:
>>>>>>>>> Benchmark.class lucene Stemmer.class
>>>>>>>>>
>>>>>>>>> org/getopt/stempel/lucene:
>>>>>>>>> StempelAnalyzer.class StempelFilter.class
>>>>>>>>> jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R META-INF/
>>>>>>>>> META-INF/:
>>>>>>>>> MANIFEST.MF
>>>>>>>>> jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R res
>>>>>>>>> res:
>>>>>>>>> tables
>>>>>>>>>
>>>>>>>>> res/tables:
>>>>>>>>> readme.txt stemmer_1000.out stemmer_100.out stemmer_2000.out
>>>>>>>>> stemmer_200.out stemmer_500.out stemmer_700.out
>>>>>>>>>
>>>>>>>>> 2010/11/2 Bernd Fehling <bernd.fehl...@uni-bielefeld.de>:
>>>>>>>>>>
>>>>>>>>>> Hi Jakub,
>>>>>>>>>>
>>>>>>>>>> if you unzip your stempel-1.0.jar, do you have the
>>>>>>>>>> required directory structure and file in there?
>>>>>>>>>> org/getopt/stempel/lucene/StempelFilter.class
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Bernd
>>>>>>>>>>
>>>>>>>>>> On 02.11.2010 13:54, Jakub Godawa wrote:
>>>>>>>>>>>
>>>>>>>>>>> Erick, I've put the jar files like that before. I also added the
>>>>>>>>>>> directive and put the file in instanceDir/lib.
>>>>>>>>>>>
>>>>>>>>>>> What is still a problem is that even though the files are loaded:
>>>>>>>>>>> 2010-11-02 13:20:48 org.apache.solr.core.SolrResourceLoader replaceClassLoader
>>>>>>>>>>> INFO: Adding 'file:/home/jgodawa/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar' to classloader
>>>>>>>>>>>
>>>>>>>>>>> I am not able to use the FilterFactory... maybe I am attempting it
>>>>>>>>>>> in a wrong way?
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Jakub Godawa.
>>>>>>>>>>>
>>>>>>>>>>> 2010/11/2 Erick Erickson <erickerick...@gmail.com>:
>>>>>>>>>>>>
>>>>>>>>>>>> The Polish stemmer jar file needs to be findable by Solr; if you copy
>>>>>>>>>>>> it to <solr_home>/lib and restart Solr you should be set.
>>>>>>>>>>>>
>>>>>>>>>>>> Alternatively, you can add another <lib> directive to the solrconfig.xml
>>>>>>>>>>>> file (there are several examples in that file already).
>>>>>>>>>>>>
>>>>>>>>>>>> I'm a little confused about not being able to find TokenFilter;
>>>>>>>>>>>> is that still a problem?
>>>>>>>>>>>>
>>>>>>>>>>>> HTH
>>>>>>>>>>>> Erick
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Nov 2, 2010 at 8:07 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you Bernd!
>>>>>>>>>>>>> I couldn't make it run though. Here is my problem:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. There is a file ~/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar
>>>>>>>>>>>>> 2. In ~/apache-solr-1.4.1/ifaq/solr/conf/solrconfig.xml there is a
>>>>>>>>>>>>> directive: <lib path="../lib/stempel-1.0.jar" />
>>>>>>>>>>>>> 3. In ~/apache-solr-1.4.1/ifaq/solr/conf/schema.xml there is a
>>>>>>>>>>>>> fieldType:
>>>>>>>>>>>>>
>>>>>>>>>>>>> (...)
>>>>>>>>>>>>> <!-- Polish -->
>>>>>>>>>>>>> <fieldType name="text_pl" class="solr.TextField">
>>>>>>>>>>>>>   <analyzer>
>>>>>>>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>>>>>>>>     <filter class="org.getopt.stempel.lucene.StempelFilter" />
>>>>>>>>>>>>>     <!-- <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory"
>>>>>>>>>>>>>          protected="protwords.txt" /> -->
>>>>>>>>>>>>>   </analyzer>
>>>>>>>>>>>>> </fieldType>
>>>>>>>>>>>>> (...)
>>>>>>>>>>>>>
>>>>>>>>>>>>> 4. The jar file is loaded but I got an error:
>>>>>>>>>>>>> SEVERE: Could not start SOLR. Check solr/home property
>>>>>>>>>>>>> java.lang.NoClassDefFoundError: org/apache/lucene/analysis/TokenFilter
>>>>>>>>>>>>>   at java.lang.ClassLoader.defineClass1(Native Method)
>>>>>>>>>>>>>   at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
>>>>>>>>>>>>>   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>>>>>>>>>>> (...)
>>>>>>>>>>>>>
>>>>>>>>>>>>> 5. A different class gave me this one:
>>>>>>>>>>>>> SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.getopt.solr.analysis.StempelTokenFilterFactory'
>>>>>>>>>>>>>   at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
>>>>>>>>>>>>>   at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:390)
>>>>>>>>>>>>> (...)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Question is: How to make <fieldType /> and <filter /> work with that
>>>>>>>>>>>>> Stempel? :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Jakub Godawa.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2010/10/29 Bernd Fehling <bernd.fehl...@uni-bielefeld.de>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Jakub,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have ported the KStemmer for use in the most recent Solr trunk version.
>>>>>>>>>>>>>> My stemmer is located in the lib directory of Solr,
>>>>>>>>>>>>>> "solr/lib/KStemmer-2.00.jar", because it belongs to Solr.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Write it as a FilterFactory and use it as a Filter like:
>>>>>>>>>>>>>> <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>>>>>>>>>>>         protected="protwords.txt" />
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is what my fieldType looks like:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <fieldType name="text_kstem" class="solr.TextField"
>>>>>>>>>>>>>>            positionIncrementGap="100">
>>>>>>>>>>>>>>   <analyzer type="index">
>>>>>>>>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>>>>>>>>>>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>>>>>>>>>>>             words="stopwords.txt" enablePositionIncrements="false" />
>>>>>>>>>>>>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>>>>>>>>>>>>             generateWordParts="1" generateNumberParts="1"
>>>>>>>>>>>>>>             catenateWords="1" catenateNumbers="1"
>>>>>>>>>>>>>>             catenateAll="0" splitOnCaseChange="1" />
>>>>>>>>>>>>>>     <filter class="solr.LowerCaseFilterFactory" />
>>>>>>>>>>>>>>     <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>>>>>>>>>>>             protected="protwords.txt" />
>>>>>>>>>>>>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>>>>>>>>>>>>   </analyzer>
>>>>>>>>>>>>>>   <analyzer type="query">
>>>>>>>>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>>>>>>>>>>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>>>>>>>>>>>             words="stopwords.txt" />
>>>>>>>>>>>>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>>>>>>>>>>>>             generateWordParts="1" generateNumberParts="1"
>>>>>>>>>>>>>>             catenateWords="0" catenateNumbers="0"
>>>>>>>>>>>>>>             catenateAll="0" splitOnCaseChange="1" />
>>>>>>>>>>>>>>     <filter class="solr.LowerCaseFilterFactory" />
>>>>>>>>>>>>>>     <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>>>>>>>>>>>             protected="protwords.txt" />
>>>>>>>>>>>>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>>>>>>>>>>>>   </analyzer>
>>>>>>>>>>>>>> </fieldType>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Bernd
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 28.10.2010 14:56, Jakub Godawa wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>> There is a Polish stemmer http://www.getopt.org/stempel/ and I
>>>>>>>>>>>>>>> have problems connecting it with Solr 1.4.1.
>>>>>>>>>>>>>>> Questions:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Where EXACTLY do I put the "stemper-1.0.jar" file?
>>>>>>>>>>>>>>> 2. How do I register the file, so I can build a fieldType
>>>>>>>>>>>>>>> like:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> <fieldType name="text_pl" class="solr.TextField">
>>>>>>>>>>>>>>>   <analyzer class="org.geoopt.solr.analysis.StempelTokenFilterFactory"/>
>>>>>>>>>>>>>>> </fieldType>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. Is that the right approach to make it work?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for verbose explanation,
>>>>>>>>>>>>>>> Jakub.
>>>>>>
>>>>>> --
>>>>>> Lance Norskog
>>>>>> goks...@gmail.com
>>>>
>>>> --
>>>> Lance Norskog
>>>> goks...@gmail.com