Hi Staszek,

I'll wait for your fix. Thank you!
Koji Sekiguchi from iPad2

On 2012/05/20, at 18:18, Stanislaw Osinski <stanis...@osinski.name> wrote:

> Hi Koji,
>
> You're right, the current code overwrites the custom tokenizer though it
> shouldn't. LuceneCarrot2TokenizerFactory is there to avoid circular
> dependencies (Carrot2 default tokenizer depends on Lucene), but it
> shouldn't be an issue with custom tokenizers.
>
> I'll try to commit a fix later today. Meanwhile, if you have a chance to
> recompile the code, a temporary solution would be to hardcode your
> tokenizer class into the fragment you pasted:
>
> BasicPreprocessingPipelineDescriptor.attributeBuilder(initAttributes)
>     .stemmerFactory(LuceneCarrot2StemmerFactory.class)
>     .tokenizerFactory(YourCustomTokenizer.class)
>     .lexicalDataFactory(SolrStopwordsCarrot2LexicalDataFactory.class);
>
> Staszek
>
> On Sun, May 20, 2012 at 9:40 AM, Koji Sekiguchi <k...@r.email.ne.jp> wrote:
>
>> Hello,
>>
>> As I'd like to use custom ITokenizerFactory, I set the following Carrot2 key
>> in solrconfig.xml:
>>
>> <searchComponent name="clustering"
>>     enable="${solr.clustering.enabled:true}"
>>     class="solr.clustering.ClusteringComponent" >
>>   <lst name="engine">
>>     <str name="name">default</str>
>>     :
>>     <str name="PreprocessingPipeline.tokenizerFactory">my.own.TokenizerFactory</str>
>>   </lst>
>> </searchComponent>
>>
>> But seems that CarrotClusteringEngine overwrites it with
>> LuceneCarrot2TokenizerFactory
>> in init() method:
>>
>> BasicPreprocessingPipelineDescriptor.attributeBuilder(initAttributes)
>>     .stemmerFactory(LuceneCarrot2StemmerFactory.class)
>>     .tokenizerFactory(LuceneCarrot2TokenizerFactory.class)
>>     .lexicalDataFactory(SolrStopwordsCarrot2LexicalDataFactory.class);
>>
>> Am I missing something?
>>
>> koji
>> --
>> Query Log Visualizer for Apache Solr
>> http://soleami.com/
>>
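
For reference, the behaviour discussed in the quoted thread (apply the Lucene tokenizer only as a default, and leave a tokenizer configured in solrconfig.xml alone) could be sketched roughly as below. This is only an illustration and not the actual fix: it uses a plain Map in place of the real Carrot2 attribute builder, the class name TokenizerDefaultSketch and the helper applyDefaultTokenizer are hypothetical, and the factory names are handled as plain strings. Only the attribute key comes from the configuration above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the actual CarrotClusteringEngine code. */
public class TokenizerDefaultSketch {
    // Attribute key as it appears in the solrconfig.xml snippet in the thread.
    static final String TOKENIZER_KEY = "PreprocessingPipeline.tokenizerFactory";

    // Hypothetical helper: set the default tokenizer factory only when
    // no tokenizer factory has been configured already.
    static void applyDefaultTokenizer(Map<String, Object> initAttributes,
                                      String defaultFactory) {
        initAttributes.putIfAbsent(TOKENIZER_KEY, defaultFactory);
    }

    public static void main(String[] args) {
        // Case 1: a custom tokenizer is configured, so it must be kept.
        Map<String, Object> custom = new HashMap<>();
        custom.put(TOKENIZER_KEY, "my.own.TokenizerFactory");
        applyDefaultTokenizer(custom, "LuceneCarrot2TokenizerFactory");
        System.out.println(custom.get(TOKENIZER_KEY));

        // Case 2: nothing is configured, so the default is applied.
        Map<String, Object> plain = new HashMap<>();
        applyDefaultTokenizer(plain, "LuceneCarrot2TokenizerFactory");
        System.out.println(plain.get(TOKENIZER_KEY));
    }
}
```

The point of the sketch is only the ordering: check for an existing value first, then fall back to the default, instead of setting the default unconditionally in init().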