Hi everybody,

I am using the edismax parser and have noticed a very specific behaviour
with how sow=true (default) handles multiword keywords.

We have a field called 'keywords', which uses the general
KeywordTokenizerFactory. There are also other text fields such as title and
description.

When we index a document with a keyword "ice cream", for example, we know
it gets indexed into that field as "ice cream".

However, at query time, I noticed that if we run an Edismax query:
q=ice cream
qf=keywords

I do not get that document back as a match. This is due to sow=true
splitting the user's query and the final tokens not being present in the
keywords field.
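
To make the mismatch concrete, here is a toy model in Python. It is not Solr
code, only a sketch of my understanding: keyword_tokenize, query_terms and
matches are made-up helper names, and the model assumes that sow=true splits
the query on whitespace before each piece goes through the field's analyzer.

```python
def keyword_tokenize(text):
    # KeywordTokenizer emits the entire input as one single token.
    return [text]


def query_terms(q, sow):
    # Assumption: with sow=true the query is split on whitespace first and
    # each piece is analyzed on its own; with sow=false the whole string
    # reaches the analyzer in one go.
    chunks = q.split() if sow else [q]
    terms = []
    for chunk in chunks:
        terms.extend(keyword_tokenize(chunk))
    return terms


# Index side: the keyword "ice cream" is stored as one term.
indexed_terms = set(keyword_tokenize("ice cream"))


def matches(q, sow):
    # A document matches if any query term equals an indexed term.
    return any(term in indexed_terms for term in query_terms(q, sow))


print(matches("ice cream", sow=True))   # query terms: "ice", "cream"
print(matches("ice cream", sow=False))  # query term: "ice cream"
```

With sow=true the query terms are "ice" and "cream", neither of which equals
the single indexed term "ice cream", so the toy model reports no match.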

I was wondering what the best practice is for this. Some thoughts I
have had:

1. Index multi-word keywords with hyphens or something similar. E.g. "ice
cream" -> "ice-cream"
2. Additionally index the separate words as keywords. E.g. "ice cream"
-> "ice cream", "ice", "cream". However, this method will result in the loss
of intent (q=ice would return this document).
3. Add a boost query which is an edismax query where we explicitly set
sow=false and add a huge boost. E.g. bq={!edismax qf=keywords^1000
sow=false bq="" boost="" pf="" tie=1.00 v="ice cream"}
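
For option 3, this is roughly how I would assemble the request parameters on
the client side. It is only a sketch: build_params is a made-up helper, the
main qf of "keywords title description" is an assumption based on the fields
mentioned above, and I am assuming that backslash escaping is enough for a
quoted local-params value.

```python
from urllib.parse import urlencode


def build_params(user_query):
    # Assumption: backslash-escaping backslashes and double quotes is
    # sufficient inside the quoted v="..." local param.
    escaped = user_query.replace("\\", "\\\\").replace('"', '\\"')
    # Nested edismax query with sow=false, so the whole phrase reaches the
    # keywords field's analyzer as a single string.
    bq = (
        '{!edismax qf=keywords^1000 sow=false bq="" boost="" pf="" '
        f'tie=1.00 v="{escaped}"}}'
    )
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": "keywords title description",  # assumed field list
        "bq": bq,
    }


params = build_params("ice cream")
print(params["bq"])
# URL-encoded form, ready to append to the select handler URL.
print(urlencode(params))
```

The main query still matches through the other text fields, and the boost
query only lifts documents whose keywords field holds the exact phrase.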

Is there a standard industry practice for handling this type of problem? Keep
in mind that the other text fields may also include these terms. E.g.
title="This is ice cream", which would match the query. This specific
problem affects the keywords field for the obvious reason that the indexing
pipeline does not tokenize keywords.

Thank you for all your amazing help,

Regards,

Ash
