[ 
https://issues.apache.org/jira/browse/SOLR-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597185#action_12597185
 ] 

Otis Gospodnetic commented on SOLR-379:
---------------------------------------

It would be great to have this available in Solr.  Because of KStem's 
incompatible license, I wasn't sure how we could handle this.  Incompatible 
license really just means we cannot distribute the KStem code (and cannot have 
it in the Lucene/Solr svn repository).  Usually when incompatible licensing is 
a problem we say "modify the build script to download the needed library on 
demand if it's not present locally".  This is what some of the Lucene contrib 
components do, for example.

However, looking at your ZIP file I see:

  -rw-r--r--      2836  15-Oct-2007  17:16:46  src/java/org/apache/solr/analysis/KStemFilterFactory.java
  -rw-r--r--     42222  15-Oct-2007  16:28:08  src/java/org/apache/lucene/analysis/KStemmer.java
  -rw-r--r--      4501  15-Oct-2007  17:08:38  src/java/org/apache/lucene/analysis/KStemFilter.java
  -rw-r--r--     34259  15-Oct-2007  16:28:24  src/java/org/apache/lucene/analysis/KStemData8.java
  -rw-r--r--     39918  15-Oct-2007  16:28:28  src/java/org/apache/lucene/analysis/KStemData7.java
  -rw-r--r--     41412  15-Oct-2007  16:28:34  src/java/org/apache/lucene/analysis/KStemData6.java
  -rw-r--r--     40457  15-Oct-2007  16:28:40  src/java/org/apache/lucene/analysis/KStemData5.java
  -rw-r--r--     40823  15-Oct-2007  16:28:44  src/java/org/apache/lucene/analysis/KStemData4.java
  -rw-r--r--     39808  15-Oct-2007  16:28:50  src/java/org/apache/lucene/analysis/KStemData3.java
  -rw-r--r--     42696  15-Oct-2007  16:29:00  src/java/org/apache/lucene/analysis/KStemData2.java
  -rw-r--r--     40020  15-Oct-2007  16:29:14  src/java/org/apache/lucene/analysis/KStemData1.java

But this is really just a duplicate of what's in 
http://ciir.cs.umass.edu/downloads/files/KStem.jar, plus the Solr-specific 
KStemFilterFactory.java.

So, could we simply download KStem.jar on demand?  And is 
KStemFilterFactory.java really copyright CIIR?  If we can change that to ASL, 
then we can include it in the repo, and with a modified build that downloads 
KStem.jar before compiling, this class would compile.
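
As a rough sketch of the download-on-demand idea, something along these lines 
could go into an Ant build file.  The project, property and target names and 
the lib/ location are made up for illustration; only the KStem.jar URL comes 
from this comment, and the real Solr build would need its own wiring:

```xml
<project name="kstem-download-sketch" default="download-kstem">

  <!-- Hypothetical property names; the URL is the one quoted in this comment. -->
  <property name="kstem.jar.url"
            value="http://ciir.cs.umass.edu/downloads/files/KStem.jar"/>
  <property name="kstem.jar" location="lib/KStem.jar"/>

  <!-- Sets kstem.present only if the jar is already available locally. -->
  <target name="check-kstem">
    <available file="${kstem.jar}" property="kstem.present"/>
  </target>

  <!-- Skipped entirely when kstem.present is set, so the download happens at most once. -->
  <target name="download-kstem" depends="check-kstem" unless="kstem.present">
    <mkdir dir="lib"/>
    <get src="${kstem.jar.url}" dest="${kstem.jar}"/>
  </target>

</project>
```

Whichever target compiles KStemFilterFactory.java would then depend on 
download-kstem and put ${kstem.jar} on its classpath.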


> KStem Token Filter
> ------------------
>
>                 Key: SOLR-379
>                 URL: https://issues.apache.org/jira/browse/SOLR-379
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Pieter Berkel
>            Priority: Minor
>         Attachments: KStemSolr.zip
>
>
> A Lucene / Solr implementation of the KStem stemmer.  Full credit goes to 
> Harry Wagner for adapting the Lucene version found here:
> http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi
> Background discussion to this stemmer (including licensing issues) can be 
> found in this thread:
> http://www.nabble.com/Embedded-about-50--faster-for-indexing-tf4325720.html#a12376295
> I've made some minor changes to KStemFilterFactory so that it compiles 
> cleanly against trunk:
> 1) removed some unnecessary imports
> 2) changed the init() method parameters introduced by SOLR-215
> 3) moved KStemFilterFactory into package org.apache.solr.analysis
> Once compiled and included in your Solr war (or as a jar in your lib 
> directory), the KStem filter can be used in your schema very easily:
>       <analyzer type="index">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>         <filter class="solr.StandardFilterFactory"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.KStemFilterFactory" cacheSize="20000"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
