Thanks for the hints! 

Sorry about hijacking the thread "query range in multivalued date field".
I replied to it by mistake.

cheers,
:-Dennis 

On 27/01/2011, at 16.48, Erik Hatcher wrote:

> Beyond what Erick said, I'll add that it is often better to "do this from the 
> outside" and send in multiple actual end-user displayable facet values.  When 
> you send in a field like "Water -- Irrigation ; Water -- Sewage", that is 
> what will get stored (if you have it set to stored), but what you might 
> rather want is each individual value stored, which can only be done by the 
> indexer sending in multiple values, not through just tokenization.
> 
>       Erik
> 
> On Jan 27, 2011, at 09:09 , Dennis Schafroth wrote:
> 
>> Hi, 
>> 
>> I am fairly new to Solr coding, but I am looking for hints on how to 
>> implement (if it is not already done) a PatternTokenizer that would index 
>> this into multivalued solr.StrField fields for faceting. Example:
>> 
>> Water -- Irrigation ; Water -- Sewage
>> 
>> should be tokenized into 
>> 
>> Water
>> Irrigation
>> Sewage
>> 
>> in multi-valued, non-tokenized fields for performance reasons. I could do it 
>> from the outside, but I would like to use this as an opportunity to learn 
>> about Solr.
>> 
>> It "works" as I want with the PatternTokenizerFactory when I am using 
>> solr.TextField, but not when I am using the non-tokenized solr.StrField. 
>> From what I have read, faceting performs better on non-tokenized fields. 
>> We need better performance on our faceted searches on these multi-value 
>> fields.  (25 million documents, three multi-valued facets)
>> 
>> I would also need a filter that removes identical values, as the feeds 
>> contain redundant data, as shown above ("Water" appears twice).
>> 
>> Can anyone point me in the right direction?
>> 
>> cheers, 
>> :-Dennis
> 
> 
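
For reference, the "do this from the outside" approach suggested in the thread could look roughly like the sketch below. It is only an illustration under assumptions: the delimiters (" -- " and " ; ") are taken from the single example value quoted above, and the field name "subject_facet" and the helper facet_values are made up for this example. The idea is that the indexing client splits and de-duplicates the raw string itself and sends the result as a list, so a multi-valued solr.StrField receives each value separately and no tokenizer is involved.

```python
import re

# Delimiters assumed from the example "Water -- Irrigation ; Water -- Sewage".
_SPLIT = re.compile(r"\s*(?:--|;)\s*")


def facet_values(raw):
    """Split a raw subject string into unique values, keeping first-seen order."""
    seen = set()
    values = []
    for part in _SPLIT.split(raw.strip()):
        # Skip empty fragments and values already emitted for this document.
        if part and part not in seen:
            seen.add(part)
            values.append(part)
    return values


# "subject_facet" is a hypothetical multi-valued, non-tokenized string field.
doc = {
    "id": "doc-1",
    "subject_facet": facet_values("Water -- Irrigation ; Water -- Sewage"),
}
print(doc["subject_facet"])  # ['Water', 'Irrigation', 'Sewage']
```

If the full strings such as "Water -- Irrigation" should be kept as the displayable facet values instead, the same helper would work with a pattern that splits on ";" only.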
