Re: using Carrot2 custom ITokenizerFactory

Koji Sekiguchi Sun, 20 May 2012 04:03:27 -0700

Hi Staszek,

I'll wait for your fix. Thank you!


Koji Sekiguchi from iPad2

On 2012/05/20, at 18:18, Stanislaw Osinski <stanis...@osinski.name> wrote:

> Hi Koji,
> 
> You're right, the current code overwrites the custom tokenizer though it
> shouldn't. LuceneCarrot2TokenizerFactory is there to avoid circular
> dependencies (Carrot2 default tokenizer depends on Lucene), but it
> shouldn't be an issue with custom tokenizers.
> 
> I'll try to commit a fix later today. Meanwhile, if you have a chance to
> recompile the code, a temporary solution would be to hardcode your
> tokenizer class into the fragment you pasted:
> 
>   BasicPreprocessingPipelineDescriptor.attributeBuilder(initAttributes)
>       .stemmerFactory(LuceneCarrot2StemmerFactory.class)
>       .tokenizerFactory(YourCustomTokenizer.class)
>       .lexicalDataFactory(SolrStopwordsCarrot2LexicalDataFactory.class);
> 
> Staszek
> 
> On Sun, May 20, 2012 at 9:40 AM, Koji Sekiguchi <k...@r.email.ne.jp> wrote:
> 
>> Hello,
>> 
>> As I'd like to use a custom ITokenizerFactory, I set the following Carrot2
>> key in solrconfig.xml:
>> 
>> <searchComponent name="clustering"
>>                  enable="${solr.clustering.enabled:true}"
>>                  class="solr.clustering.ClusteringComponent" >
>>   <lst name="engine">
>>     <str name="name">default</str>
>>        :
>>     <str name="PreprocessingPipeline.tokenizerFactory">my.own.TokenizerFactory</str>
>>   </lst>
>> </searchComponent>
>> 
>> But it seems that CarrotClusteringEngine overwrites it with
>> LuceneCarrot2TokenizerFactory in its init() method:
>> 
>>   BasicPreprocessingPipelineDescriptor.attributeBuilder(initAttributes)
>>       .stemmerFactory(LuceneCarrot2StemmerFactory.class)
>>       .tokenizerFactory(LuceneCarrot2TokenizerFactory.class)
>>       .lexicalDataFactory(SolrStopwordsCarrot2LexicalDataFactory.class);
>> 
>> Am I missing something?
>> 
>> koji
>> --
>> Query Log Visualizer for Apache Solr
>> http://soleami.com/
>>
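
The behaviour discussed in this thread (install the Lucene-based tokenizer only when solrconfig.xml does not already name a custom one) can be sketched roughly as below. This is a minimal illustration, not the fix that was committed: it assumes the init attributes behave like a plain Map keyed by the attribute name shown in the configuration above, and the class name TokenizerDefaults, the method applyDefaults, and the use of String values in place of factory classes are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Solr or Carrot2.
class TokenizerDefaults {
    // Attribute key as it appears in the solrconfig.xml snippet in this thread.
    static final String TOKENIZER_KEY = "PreprocessingPipeline.tokenizerFactory";

    // A String stands in for the default factory class (assumption for this sketch).
    static final String DEFAULT_TOKENIZER = "LuceneCarrot2TokenizerFactory";

    // Set the default tokenizer factory only if the user has not configured one.
    static void applyDefaults(Map<String, Object> initAttributes) {
        if (!initAttributes.containsKey(TOKENIZER_KEY)) {
            initAttributes.put(TOKENIZER_KEY, DEFAULT_TOKENIZER);
        }
    }

    public static void main(String[] args) {
        // Case 1: a custom factory is configured, so it must survive.
        Map<String, Object> configured = new HashMap<>();
        configured.put(TOKENIZER_KEY, "my.own.TokenizerFactory");
        applyDefaults(configured);
        System.out.println(configured.get(TOKENIZER_KEY));

        // Case 2: nothing is configured, so the default is filled in.
        Map<String, Object> unconfigured = new HashMap<>();
        applyDefaults(unconfigured);
        System.out.println(unconfigured.get(TOKENIZER_KEY));
    }
}
```

The only difference from the unconditional call quoted above is the containsKey check, which leaves a user-supplied value untouched while still supplying a default when none is given.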
