RE: Need help with DictionaryCompoundWordTokenFilterFactory

Steven A Rowe Fri, 10 Oct 2008 10:15:47 -0700

Hi Ralf,

On 10/10/2008 at 10:57 AM, Kraus, Ralf | pixelhouse GmbH wrote:
> I am trying to solve the typical german "Donaudampfschiff"-
> problem by using the DictionaryCompoundWordTokenFilter ...
> Anyone can show me how to configure my schema.xml to use the
> DictionaryCompoundWordTokenFilterFactory ???


Minimally, add the following inside the <analyzer> section for your field type:

<filter class="solr.DictionaryCompoundWordTokenFilterFactory"
        dictionary="/path/to/your/dictionary" />

You can also add the following (optional) attributes:

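  - "minWordSize" (default: 5): shortest token the filter will try to decompose
  - "minSubwordSize" (default: 2): shortest subword that is emitted
  - "maxSubwordSize" (default: 15): longest subword that is emitted
  - "onlyLongestMatch" (default: true): emit only the longest matching subword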
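
Putting the pieces together, a complete field type might look roughly like the sketch below. Treat it as an illustration, not a tested configuration: the field type name "text_de_compound", the tokenizer, and the lowercase filter are my assumptions, and the dictionary path is the same placeholder as above. The dictionary attribute is written as "dictionary", which is the name the Solr factory reads as far as I know, so check it against your Solr version. The four optional attributes are set to the defaults listed above, so leaving them out should behave the same way.

```xml
<!-- Sketch only: "text_de_compound" is a hypothetical name, and the
     tokenizer and lowercase filter are assumptions, not requirements. -->
<fieldType name="text_de_compound" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Normalize case before the compound filter sees the tokens -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Optional attributes shown with the default values listed above -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="/path/to/your/dictionary"
            minWordSize="5"
            minSubwordSize="2"
            maxSubwordSize="15"
            onlyLongestMatch="true"/>
  </analyzer>
</fieldType>
```

A field declared with this type would then keep the original token and add whatever dictionary subwords the filter finds in it.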

FYI, the compound package summary in the nightly trunk Lucene contrib javadocs 
has some useful information:

<http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/contrib-analyzers/org/apache/lucene/analysis/compound/package-summary.html>

Steve
