https://issues.apache.org/jira/browse/SOLR-2237

On Mon, Nov 15, 2010 at 5:04 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:
> I tried to reach the authors twice, but with no luck. I've seen some
> posts where people were finally able to launch it (without much pain).
> I don't know. If any expert would be kind enough to run Stempel on
> his/her machine and send me a verbose step-by-step solution, I
> would really appreciate it.
>
> Cheers,
> Jakub Godawa.
>
> 2010/11/13 Lance Norskog <goks...@gmail.com>:
>> I don't know if the Stempel jar includes the Java source. At this point I
>> think you should ask the author of Stempel to make a Solr front-end for it.
>> It would be very simple for him.
>>
>> Jakub Godawa wrote:
>>>
>>> Am I not doing that in step 4? I am compiling the whole folder
>>> that was extracted before, but now with the new class file.
>>>
>>> 2010/11/12 Lance Norskog<goks...@gmail.com>:
>>>
>>>>
>>>> I think you have to compile all of the stempel source including your
>>>> filter factory into one jar at the same time. Everybody does this; I
>>>> don't know how different Java versions make class file binaries.
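
Before copying a rebuilt jar into lib, it may help to check that every .class entry in it really is compiled bytecode. A minimal sketch, assuming only the JDK's java.util.zip API; JarClassCheck and badClassEntries are hypothetical names, not part of Solr or Stempel:

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class JarClassCheck {
    // Every compiled Java class file starts with these four bytes.
    static final int CLASS_MAGIC = 0xCAFEBABE;

    /** Returns the names of .class entries that do not start with 0xCAFEBABE. */
    static List<String> badClassEntries(String jarPath) throws IOException {
        List<String> bad = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().endsWith(".class")) {
                    continue;
                }
                try (InputStream raw = zip.getInputStream(entry);
                     DataInputStream in = new DataInputStream(raw)) {
                    // readInt() is big-endian, like the class file header.
                    if (in.readInt() != CLASS_MAGIC) {
                        bad.add(entry.getName());
                    }
                } catch (EOFException tooShort) {
                    // Fewer than four bytes: cannot be a class file.
                    bad.add(entry.getName());
                }
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar to inspect, e.g. the rebuilt stempel-1.0.jar
        for (String name : badClassEntries(args[0])) {
            System.out.println("not a compiled class: " + name);
        }
    }
}
```

Run against the rebuilt jar, it lists any entry that has a .class name but was not produced by a compiler.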
>>>>
>>>> On Thu, Nov 11, 2010 at 3:06 AM, Jakub Godawa<jakub.god...@gmail.com>
>>>>  wrote:
>>>>
>>>>>
>>>>> Hi! Sorry for such a break, but I was moving house... anyway:
>>>>>
>>>>> 1. I took the
>>>>> ~/apache-solr/src/java/org/apache/solr/analysis/StandardFilterFactory.java
>>>>> file, saved a copy as StempelFilterFactory.java, and modified it in Vim
>>>>> like this:
>>>>>
>>>>> package org.getopt.solr.analysis;
>>>>>
>>>>> import org.apache.lucene.analysis.TokenStream;
>>>>> import org.apache.solr.analysis.BaseTokenFilterFactory;
>>>>> import org.getopt.stempel.lucene.StempelFilter;
>>>>>
>>>>> public class StempelTokenFilterFactory extends BaseTokenFilterFactory {
>>>>>  public StempelFilter create(TokenStream input) {
>>>>>    return new StempelFilter(input);
>>>>>  }
>>>>> }
>>>>>
>>>>> 2. Then I put the file into the extracted stempel-1.0.jar, under
>>>>> ./org/getopt/solr/analysis/
>>>>> 3. Then I created a class from it: jar -cf
>>>>> StempelTokenFilterFactory.class StempelFilterFactory.java
>>>>> 4. Then I created a new stempel-1.0.jar archive: jar -cf stempel-1.0.jar
>>>>> -C ./stempel-1.0/ .
>>>>> 5. Then in schema.xml I put:
>>>>>
>>>>>    <fieldType name="text_pl" class="solr.TextField">
>>>>>      <analyzer>
>>>>>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>        <filter class="solr.LowerCaseFilterFactory"/>
>>>>>        <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory" />
>>>>>      </analyzer>
>>>>>    </fieldType>
>>>>>
>>>>> 6. I started the Solr server and received the following error:
>>>>>
>>>>> 2010-11-11 11:50:56 org.apache.solr.common.SolrException log
>>>>> SEVERE: java.lang.ClassFormatError: Incompatible magic value 1347093252 in class file org/getopt/solr/analysis/StempelTokenFilterFactory
>>>>>        at java.lang.ClassLoader.defineClass1(Native Method)
>>>>>        at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
>>>>>        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>>> ...
>>>>>
>>>>> Question: What is wrong? :) I use "jar (fastjar) 0.98" to create jars.
>>>>> I googled that error, but no answer gave me an idea of what is wrong
>>>>> in my .java file.
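
A note on the error above: 1347093252 is 0x504B0304, the bytes "PK\003\004" that begin every ZIP (and therefore JAR) archive, whereas a compiled class file must begin with 0xCAFEBABE. That fits step 3: jar -cf packs the .java source into an archive that merely has a .class name; it does not compile anything. Compiling is javac's job, with the Solr, Lucene and Stempel classes on the classpath, and javac also expects the source file to be named after its public class (StempelTokenFilterFactory.java). A small sketch that decodes the reported value; MagicValue and its methods are made-up names for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class MagicValue {
    // First four bytes of every compiled Java class file.
    static final int CLASS_MAGIC = 0xCAFEBABE;
    // Local file header signature that starts a ZIP/JAR archive: "PK\003\004".
    static final int ZIP_MAGIC = 0x504B0304;

    /** Returns the four bytes of the value in big-endian order, as they appear in the file. */
    static byte[] toBytes(int magic) {
        return ByteBuffer.allocate(4).putInt(magic).array();
    }

    /** Names the file format that starts with the given four-byte value. */
    static String describe(int magic) {
        if (magic == CLASS_MAGIC) {
            return "compiled class file";
        }
        if (magic == ZIP_MAGIC) {
            return "ZIP/JAR archive";
        }
        return "unknown format";
    }

    public static void main(String[] args) {
        int reported = 1347093252; // value from the ClassFormatError in the thread
        byte[] bytes = toBytes(reported);
        String firstTwo = new String(bytes, 0, 2, StandardCharsets.US_ASCII);
        // prints: 0x504B0304 starts with 'PK' -> ZIP/JAR archive
        System.out.printf("0x%08X starts with '%s' -> %s%n",
                reported, firstTwo, describe(reported));
    }
}
```

So the class loader was handed an archive, not bytecode; the content of the .java file never came into play.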
>>>>>
>>>>> Please help, as I believe I am close to the end of that subject.
>>>>>
>>>>> Cheers,
>>>>> Jakub Godawa.
>>>>>
>>>>> 2010/11/3 Lance Norskog<goks...@gmail.com>:
>>>>>
>>>>>>
>>>>>> Here's the problem: Solr is a little dumb about these Filter classes,
>>>>>> and so you have to make a Factory object for the Stempel Filter.
>>>>>>
>>>>>> There are a lot of other FilterFactory classes. You would have to just
>>>>>> copy one and change the names to Stempel and it might actually work.
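
To illustrate why a factory is needed: the schema only names a class, so the loader has to create it without arguments, while a filter wraps an input stream that it receives in its constructor. The sketch below uses stand-in types only (Stream, FilterFactory and the LowerCase classes are hypothetical, not the real Lucene or Solr classes), and the load method is only a rough approximation of config-driven class loading:

```java
import java.util.Locale;

final class FactorySketch {
    /** Stand-in for a token stream; NOT Lucene's TokenStream. */
    interface Stream {
        String next();
    }

    /** Stand-in for a filter factory; NOT Solr's TokenFilterFactory. */
    interface FilterFactory {
        Stream create(Stream input);
    }

    /** A filter wraps its input, so it has no no-argument constructor. */
    public static final class LowerCaseStream implements Stream {
        private final Stream input;

        public LowerCaseStream(Stream input) {
            this.input = input;
        }

        @Override
        public String next() {
            String token = input.next();
            return token == null ? null : token.toLowerCase(Locale.ROOT);
        }
    }

    /** The factory can be created from its name alone and then builds filters on demand. */
    public static final class LowerCaseFactory implements FilterFactory {
        public LowerCaseFactory() {
        }

        @Override
        public Stream create(Stream input) {
            return new LowerCaseStream(input);
        }
    }

    /** Creates an instance from a class name, as a config-driven loader would. */
    static FilterFactory load(String className) throws ReflectiveOperationException {
        return (FilterFactory) Class.forName(className).getDeclaredConstructor().newInstance();
    }

    /** Helper: a stream that yields one token and then null. */
    static Stream singleToken(String token) {
        String[] pending = {token};
        return () -> {
            String t = pending[0];
            pending[0] = null;
            return t;
        };
    }

    public static void main(String[] args) throws Exception {
        FilterFactory factory = load(LowerCaseFactory.class.getName());
        System.out.println(factory.create(singleToken("Stempel")).next());
        try {
            load(LowerCaseStream.class.getName());
        } catch (NoSuchMethodException e) {
            System.out.println("a filter class cannot be created from its name alone");
        }
    }
}
```

Naming the filter class directly in the schema fails for the same kind of reason; the factory is the piece that can be instantiated by name.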
>>>>>>
>>>>>> This will take some Solr programming; perhaps the author can help you?
>>>>>>
>>>>>> On Tue, Nov 2, 2010 at 7:08 AM, Jakub Godawa<jakub.god...@gmail.com>
>>>>>>  wrote:
>>>>>>
>>>>>>>
>>>>>>> Sorry, I am not a Java programmer at all. I would appreciate more
>>>>>>> verbose (or step-by-step) help.
>>>>>>>
>>>>>>> 2010/11/2 Bernd Fehling<bernd.fehl...@uni-bielefeld.de>:
>>>>>>>
>>>>>>>>
>>>>>>>> So you call org.getopt.solr.analysis.StempelTokenFilterFactory.
>>>>>>>> In this case I would assume a file StempelTokenFilterFactory.class
>>>>>>>> in your directory org/getopt/solr/analysis/.
>>>>>>>>
>>>>>>>> And a class which extends BaseTokenFilterFactory, right?
>>>>>>>> ...
>>>>>>>> public class StempelTokenFilterFactory extends BaseTokenFilterFactory
>>>>>>>> implements ResourceLoaderAware {
>>>>>>>> ...
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 02.11.2010 14:20, Jakub Godawa wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is what stempel-1.0.jar consists of after jar -xf:
>>>>>>>>>
>>>>>>>>> jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R org/
>>>>>>>>> org/:
>>>>>>>>> egothor  getopt
>>>>>>>>>
>>>>>>>>> org/egothor:
>>>>>>>>> stemmer
>>>>>>>>>
>>>>>>>>> org/egothor/stemmer:
>>>>>>>>> Cell.class     Diff.class    Gener.class  MultiTrie2.class
>>>>>>>>> Optimizer2.class  Reduce.class        Row.class    TestAll.class
>>>>>>>>> TestLoad.class  Trie$StrEnum.class
>>>>>>>>> Compile.class  DiffIt.class  Lift.class   MultiTrie.class
>>>>>>>>> Optimizer.class   Reduce$Remap.class  Stock.class  Test.class
>>>>>>>>> Trie.class
>>>>>>>>>
>>>>>>>>> org/getopt:
>>>>>>>>> stempel
>>>>>>>>>
>>>>>>>>> org/getopt/stempel:
>>>>>>>>> Benchmark.class  lucene  Stemmer.class
>>>>>>>>>
>>>>>>>>> org/getopt/stempel/lucene:
>>>>>>>>> StempelAnalyzer.class  StempelFilter.class
>>>>>>>>> jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R META-INF/
>>>>>>>>> META-INF/:
>>>>>>>>> MANIFEST.MF
>>>>>>>>> jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R res
>>>>>>>>> res:
>>>>>>>>> tables
>>>>>>>>>
>>>>>>>>> res/tables:
>>>>>>>>> readme.txt  stemmer_1000.out  stemmer_100.out  stemmer_2000.out
>>>>>>>>> stemmer_200.out  stemmer_500.out  stemmer_700.out
>>>>>>>>>
>>>>>>>>> 2010/11/2 Bernd Fehling<bernd.fehl...@uni-bielefeld.de>:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Jakub,
>>>>>>>>>>
>>>>>>>>>> if you unzip your stempel-1.0.jar do you have the
>>>>>>>>>> required directory structure and file in there?
>>>>>>>>>> org/getopt/stempel/lucene/StempelFilter.class
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Bernd
>>>>>>>>>>
>>>>>>>>>> On 02.11.2010 13:54, Jakub Godawa wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Erick, I had already placed the jar file like that. I also added the
>>>>>>>>>>> <lib> directive and put the file in instanceDir/lib.
>>>>>>>>>>>
>>>>>>>>>>> The problem remains even though the file is loaded:
>>>>>>>>>>> 2010-11-02 13:20:48 org.apache.solr.core.SolrResourceLoader replaceClassLoader
>>>>>>>>>>> INFO: Adding 'file:/home/jgodawa/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar' to classloader
>>>>>>>>>>>
>>>>>>>>>>> I am not able to use the FilterFactory... maybe I am going about it
>>>>>>>>>>> the wrong way?
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Jakub Godawa.
>>>>>>>>>>>
>>>>>>>>>>> 2010/11/2 Erick Erickson<erickerick...@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The Polish stemmer jar file needs to be findable by Solr; if you copy
>>>>>>>>>>>> it to <solr_home>/lib and restart Solr you should be set.
>>>>>>>>>>>>
>>>>>>>>>>>> Alternatively, you can add another <lib> directive to the
>>>>>>>>>>>> solrconfig.xml file (there are several examples in that file already).
>>>>>>>>>>>>
>>>>>>>>>>>> I'm a little confused about not being able to find TokenFilter; is
>>>>>>>>>>>> that still a problem?
>>>>>>>>>>>>
>>>>>>>>>>>> HTH
>>>>>>>>>>>> Erick
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Nov 2, 2010 at 8:07 AM, Jakub
>>>>>>>>>>>> Godawa<jakub.god...@gmail.com>  wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you, Bernd! I couldn't make it run, though. Here is my problem:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. There is a file ~/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar
>>>>>>>>>>>>> 2. In ~/apache-solr-1.4.1/ifaq/solr/conf/solrconfig.xml there is a
>>>>>>>>>>>>> directive: <lib path="../lib/stempel-1.0.jar" />
>>>>>>>>>>>>> 3. In ~/apache-solr-1.4.1/ifaq/solr/conf/schema.xml there is a
>>>>>>>>>>>>> fieldType:
>>>>>>>>>>>>>
>>>>>>>>>>>>> (...)
>>>>>>>>>>>>>  <!-- Polish -->
>>>>>>>>>>>>>  <fieldType name="text_pl" class="solr.TextField">
>>>>>>>>>>>>>    <analyzer>
>>>>>>>>>>>>>      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>>>>>>>>      <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>>>>>>>>      <filter class="org.getopt.stempel.lucene.StempelFilter" />
>>>>>>>>>>>>>      <!--<filter class="org.getopt.solr.analysis.StempelTokenFilterFactory" protected="protwords.txt" />  -->
>>>>>>>>>>>>>    </analyzer>
>>>>>>>>>>>>>  </fieldType>
>>>>>>>>>>>>> (...)
>>>>>>>>>>>>>
>>>>>>>>>>>>> 4. The jar file is loaded, but I get an error:
>>>>>>>>>>>>> SEVERE: Could not start SOLR. Check solr/home property
>>>>>>>>>>>>> java.lang.NoClassDefFoundError: org/apache/lucene/analysis/TokenFilter
>>>>>>>>>>>>>      at java.lang.ClassLoader.defineClass1(Native Method)
>>>>>>>>>>>>>      at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
>>>>>>>>>>>>>      at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>>>>>>>>>>> (...)
>>>>>>>>>>>>>
>>>>>>>>>>>>> 5. A different class gave me this one:
>>>>>>>>>>>>> SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.getopt.solr.analysis.StempelTokenFilterFactory'
>>>>>>>>>>>>>      at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
>>>>>>>>>>>>>      at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:390)
>>>>>>>>>>>>> (...)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The question is: how do I make <fieldType /> and <filter /> work
>>>>>>>>>>>>> with Stempel? :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Jakub Godawa.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2010/10/29 Bernd Fehling<bernd.fehl...@uni-bielefeld.de>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Jakub,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have ported the KStemmer for use in the most recent Solr trunk
>>>>>>>>>>>>>> version. My stemmer is located in the lib directory of Solr,
>>>>>>>>>>>>>> "solr/lib/KStemmer-2.00.jar", because it belongs to Solr.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Write it as a FilterFactory and use it as a filter like:
>>>>>>>>>>>>>> <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>>>>>>>>>>>         protected="protwords.txt" />
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is what my fieldType looks like:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    <fieldType name="text_kstem" class="solr.TextField"
>>>>>>>>>>>>>>               positionIncrementGap="100">
>>>>>>>>>>>>>>      <analyzer type="index">
>>>>>>>>>>>>>>        <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>>>>>>>>>>>        <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>>>>>>>>>>>                words="stopwords.txt" enablePositionIncrements="false" />
>>>>>>>>>>>>>>        <filter class="solr.WordDelimiterFilterFactory"
>>>>>>>>>>>>>>                generateWordParts="1" generateNumberParts="1"
>>>>>>>>>>>>>>                catenateWords="1" catenateNumbers="1"
>>>>>>>>>>>>>>                catenateAll="0" splitOnCaseChange="1" />
>>>>>>>>>>>>>>        <filter class="solr.LowerCaseFilterFactory" />
>>>>>>>>>>>>>>        <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>>>>>>>>>>>                protected="protwords.txt" />
>>>>>>>>>>>>>>        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>>>>>>>>>>>>      </analyzer>
>>>>>>>>>>>>>>      <analyzer type="query">
>>>>>>>>>>>>>>        <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>>>>>>>>>>>        <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>>>>>>>>>>>                words="stopwords.txt" />
>>>>>>>>>>>>>>        <filter class="solr.WordDelimiterFilterFactory"
>>>>>>>>>>>>>>                generateWordParts="1" generateNumberParts="1"
>>>>>>>>>>>>>>                catenateWords="0" catenateNumbers="0"
>>>>>>>>>>>>>>                catenateAll="0" splitOnCaseChange="1" />
>>>>>>>>>>>>>>        <filter class="solr.LowerCaseFilterFactory" />
>>>>>>>>>>>>>>        <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>>>>>>>>>>>                protected="protwords.txt" />
>>>>>>>>>>>>>>        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>>>>>>>>>>>>      </analyzer>
>>>>>>>>>>>>>>    </fieldType>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Bernd
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 28.10.2010 14:56, Jakub Godawa wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>> There is a Polish stemmer, http://www.getopt.org/stempel/, and I
>>>>>>>>>>>>>>> have problems connecting it with Solr 1.4.1.
>>>>>>>>>>>>>>> Questions:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Where EXACTLY do I put the "stempel-1.0.jar" file?
>>>>>>>>>>>>>>> 2. How do I register the file, so I can build a fieldType like:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> <fieldType name="text_pl" class="solr.TextField">
>>>>>>>>>>>>>>>   <analyzer class="org.getopt.solr.analysis.StempelTokenFilterFactory"/>
>>>>>>>>>>>>>>> </fieldType>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. Is that the right approach to make it work?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks in advance for a verbose explanation,
>>>>>>>>>>>>>>> Jakub.
>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Lance Norskog
>>>>>> goks...@gmail.com
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Lance Norskog
>>>> goks...@gmail.com
>>>>
>>>>
>>
>
