Hi,

I'm pretty new to SOLR coding, and I'm looking for hints on how to implement (if it hasn't already been done) a PatternTokenizer that would index values like the one below into multi-valued fields of type solr.StrField for faceting. Example:

Water -- Irrigation ; Water -- Sewage

should be split into the values Water, Irrigation and Sewage, stored in a multi-valued, non-tokenized field for performance reasons. I could do this from the outside, before indexing, but I would like to use this as an opportunity to learn about SOLR.

It "works" the way I want with the PatternTokenizerFactory when I use solr.TextField, but not when I use the non-tokenized solr.StrField. From what I have read, though, facet performance is better on non-tokenized fields, and we need better performance on our faceted searches on these multi-valued fields (25 million documents, three multi-valued facets).

I would also need a filter that removes identical values, since the feeds contain redundant data, as the example above shows.

Can anyone point me in the right direction?

Cheers,
:-Dennis
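
P.S. To make the desired output concrete, here is a minimal sketch of the split-and-deduplicate step done from the outside, in plain Java, before the document is sent to SOLR. It is not SOLR code. The class and method names (FacetValueSplitter, splitFacetValues) are hypothetical, and the delimiter pattern is only an assumption based on the example above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

public class FacetValueSplitter {

    // Assumed delimiters: "--" or ";" with optional whitespace around them.
    private static final Pattern DELIMITER = Pattern.compile("\\s*(?:--|;)\\s*");

    // Splits a raw feed value and drops duplicates, keeping first-seen order.
    public static List<String> splitFacetValues(String raw) {
        Set<String> unique = new LinkedHashSet<>();
        for (String part : DELIMITER.split(raw.trim())) {
            if (!part.isEmpty()) {
                unique.add(part);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // prints [Water, Irrigation, Sewage]
        System.out.println(splitFacetValues("Water -- Irrigation ; Water -- Sewage"));
    }
}
```

Each element of the returned list would then be added as a separate value of the multi-valued field when building the document. What I am hoping to find is a way to get the same result inside SOLR.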