Re: Performance Issue when querying Multivalued fields [SOLR 6.1.0]

Erick Erickson Thu, 22 Sep 2016 20:40:11 -0700

If you can break these up into tokens somehow, that's clearly best. But from
the patterns you show, that's not likely. WordDelimiterFilterFactory won't
quite work, since it wouldn't be able to separate ASEF into the token SEF.


You'll have a _lot_ fewer terms if you don't use edgengram. Try just
using bigrams (i.e. NGramFilterFactory) with both minGramSize and
maxGramSize set to 2.

Now you do phrase searches (also automatic) on pairs. So in your example
some of the pairs are:
#o
of
ff
f-
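
As a rough illustration of where those pairs come from, here is a minimal
Python sketch of bigram generation. It is not Solr code: the `bigrams` helper
is hypothetical and only mimics lowercasing followed by 2-character grams.

```python
def bigrams(text):
    """Return every overlapping 2-character gram of the lowercased text."""
    lowered = text.lower()
    return [lowered[i:i + 2] for i in range(len(lowered) - 1)]


value = "executionvenuetype#OFF-FACILITY"
grams = bigrams(value)

# The pairs around "#OFF-" sit next to each other in the output.
start = grams.index("#o")
print(grams[start:start + 4])
```

Every value is reduced to short, heavily shared pairs this way, which is why
the term dictionary stays small.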

To find off, you search for the _phrase_ "of ff". There'll be some
fiddling here to make it all work.
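
To make the phrase idea concrete, here is a small Python sketch that turns a
keyword into its bigram phrase and checks it against a stored value. The
helper names (`bigrams`, `phrase_query`, `phrase_matches`) are made up for
illustration, and the sketch assumes each bigram is indexed at its own
consecutive position. Whether your analysis chain really assigns positions
that way is part of the fiddling and is worth checking on the Analysis screen.

```python
def bigrams(text):
    """Return every overlapping 2-character gram of the lowercased text."""
    lowered = text.lower()
    return [lowered[i:i + 2] for i in range(len(lowered) - 1)]


def phrase_query(keyword):
    """Build the quoted bigram phrase for a keyword, e.g. off -> "of ff"."""
    grams = bigrams(keyword)
    if not grams:
        raise ValueError("keyword must have at least 2 characters")
    return '"' + " ".join(grams) + '"'


def phrase_matches(keyword, value):
    """True if the keyword's bigrams occur at consecutive positions in value."""
    wanted = bigrams(keyword)
    indexed = bigrams(value)
    if not wanted:
        return False
    return any(
        indexed[i:i + len(wanted)] == wanted
        for i in range(len(indexed) - len(wanted) + 1)
    )


print(phrase_query("off"))
print(phrase_matches("sef", "partyid#B2ASEF9AJP5P9OLL1190"))
```

Note that a single-character keyword produces no bigrams, so it cannot be
matched with this scheme.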

Best,
Erick

On Thu, Sep 22, 2016 at 11:49 AM, slee <sleed...@gmail.com> wrote:
> Alex,
>
> You do have a point with EdgeNGramFilterFactory. As requested, I've attached
> a sample screenshot for your review.
> <http://lucene.472066.n3.nabble.com/file/n4297542/sample.png>
>
> Erick,
>
> Here's my use-case. Assume I have the following terms stored in global_Value
> as such:
> - executionvenuetype#*OFF*-FACILITY
> - partyid#B2A*SEF*9AJP5P9OLL1190
>
> Now, I want to retrieve any document matching the term in global_Value that
> contains the keywords "off" and "sef". With regard to the leading wildcard,
> that's intentional. Not a mail issue. These fields typically contain GUIDs
> and some financial terms (e.g. bonds, swaps, etc.). If I don't use any
> wildcard, then it's an exact match. But my use-case dictates that it
> should retrieve if it's a partial match.
>
> So what's my best bet for an analyzer in such cases?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Performance-Issue-when-querying-Multivalued-fields-SOLR-6-1-0-tp4297255p4297542.html
> Sent from the Solr - User mailing list archive at Nabble.com.
