Hi, 

I'm pretty new to Solr coding, and I'm looking for hints on how to implement 
(if it hasn't already been done) a PatternTokenizer that would index a value 
like the one below into a multi-valued solr.StrField for faceting. Example:

Water -- Irrigation ; Water -- Sewage

should be tokenized into 

Water
Irrigation
Sewage

in a multi-valued, non-tokenized field, for performance reasons. I could do the 
splitting from the outside, but I would like to use this as an opportunity to 
learn about Solr.

It "works" the way I want with the PatternTokenizerFactory when I use 
solr.TextField, but not when I use the non-tokenized solr.StrField. From what 
I have read, however, faceting performs better on non-tokenized fields, and we 
need better performance for our faceted searches on these multi-valued fields 
(25 million documents, three multi-valued facets).

I would also need a filter that removes identical values, since the feeds 
contain redundant data, as shown above.
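
To make the intended result concrete, here is a rough sketch of the splitting 
and de-duplication as I would do it from the outside, before sending the values 
to a multi-valued field. It is only an illustration: the function name 
split_subjects is made up, and the delimiters (" -- " and ";") are assumed 
from the single example above, so real feeds may need a different pattern.

```python
import re

# Assumed delimiters, taken from the example value only: "--" separates a
# heading from its subdivision and ";" separates entries. Whitespace around
# either delimiter is swallowed by the pattern.
_DELIMITERS = re.compile(r"\s*(?:--|;)\s*")


def split_subjects(raw):
    """Return the distinct terms found in `raw`, in first-seen order."""
    seen = set()
    terms = []
    for term in _DELIMITERS.split(raw.strip()):
        # Skip empty pieces (e.g. from a trailing ";") and repeated terms.
        if term and term not in seen:
            seen.add(term)
            terms.append(term)
    return terms


print(split_subjects("Water -- Irrigation ; Water -- Sewage"))
# prints ['Water', 'Irrigation', 'Sewage']
```

Each element of the returned list would then be sent as one value of the 
multi-valued field. What I am asking is whether the same thing can be done 
inside Solr at index time.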

Can anyone point me in the right direction?

cheers, 
:-Dennis 
