Hi everybody, I am using the edismax parser and have noticed a very specific behaviour with how sow=true (default) handles multiword keywords.
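To make the behaviour concrete, here is a toy Python model of what I believe is happening. It is not Solr code: `keyword_analyze`, `query_tokens` and `matches` are hypothetical helpers, and the sketch assumes a field whose analyzer keeps the whole value as a single token.

```python
def keyword_analyze(text):
    # Keyword-style analysis (assumption): the whole input becomes one token.
    return [text]


def query_tokens(query, field_analyzer, sow):
    # sow=True: split on whitespace first, then analyze each piece separately.
    # sow=False: hand the whole query string to the field's analyzer.
    pieces = query.split() if sow else [query]
    tokens = []
    for piece in pieces:
        tokens.extend(field_analyzer(piece))
    return tokens


# Tokens stored for a document whose keyword is "ice cream".
indexed = set(keyword_analyze("ice cream"))


def matches(query, sow):
    # A document matches if any query token equals an indexed token.
    return any(tok in indexed for tok in query_tokens(query, keyword_analyze, sow))


print(matches("ice cream", sow=True))   # False: "ice" and "cream" are not indexed
print(matches("ice cream", sow=False))  # True: "ice cream" is a single token
```

In this simplified model the only difference between the two calls is whether the query is split before it reaches the field's analyzer.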
We have a field called 'keywords', which uses KeywordTokenizerFactory. There are also other text fields such as title and description. When we index a document with the keyword "ice cream", for example, it gets indexed into that field as the single token "ice cream". However, at query time, if we run an edismax query:

q=ice cream
qf=keywords

I do not get that document back as a match. This is because sow=true splits the user's query on whitespace before analysis, and the resulting tokens ("ice" and "cream") are not present in the keywords field.

I was wondering what the best practice around this is. Some thoughts I have had:

1. Index multi-word keywords with hyphens or something similar. E.g. "ice cream" -> "ice-cream"
2. Additionally index the separate words as keywords. E.g. "ice cream" -> "ice cream", "ice", "cream". However, this method loses intent (q=ice would return this document).
3. Add a boost query that is itself an edismax query, where we explicitly set sow=false and apply a huge boost. E.g. bq={!edismax qf=keywords^1000 sow=false bq="" boost="" pf="" tie=1.00 v="ice cream"}

Is there an industry-standard solution for this type of problem? Keep in mind that the other text fields may also include these terms, e.g. title="This is ice cream", which would match the query. This specific problem affects the keywords field because the indexing pipeline does not tokenize keywords.

Thank you for all your amazing help,

Regards,
Ash