First, note that in Lucene's default similarity there are already two biases towards matches with fewer fields. Try to take advantage of those before going on a boosting expedition:
1. Each term tends to get converted into a boolean SHOULD clause. Every SHOULD clause that matches adds to the score, so the fewer matches, the lower the score.
2. For an even stronger bias, Lucene applies **coord**, the coordination factor. If only 1 out of 3 search terms matches the field being searched, the score is multiplied by 1/3, punishing it. So matches where more terms match should have a much higher chance of winning.

If you want to know more, read Lucene's javadocs on similarity:
https://lucene.apache.org/core/5_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html

"Huh," you're thinking, "why doesn't my scenario just work then?" What you're doing is *cross_fields* search. Cross-field search is something new to Elasticsearch whereby both fields are blended together and treated like a single field. So the biasing above applies to the two fields together. If you want to know more about cross-field search, here's an article I recently wrote:
http://opensourceconnections.com/blog/2015/03/19/elasticsearch-cross-field-search-is-a-lie/

If you want to actually have a bias towards a field with more matches, I'd recommend best_fields or most_fields search. They take both search terms to each field first, performing a separate search in each field. The per-field scores are then combined (most_fields adds them; best_fields takes the max).

Until I finish the related chapter in the search relevance book I'm writing <shameless plug :-p http://manning.com/turnbull>, the best place to read about these topics is the docs or the online guide.
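As a rough sketch of the best_fields suggestion (untested against your index, so treat it as an assumption, not a drop-in fix), your query might become something like the one below. I've dropped "operator": "and" on purpose: best_fields and most_fields apply the operator to each field separately, so keeping it would require both terms in the same field and the James Smith document would stop matching at all.

```json
{
  "query": {
    "multi_match": {
      "type": "best_fields",
      "query": "john smith",
      "fields": ["name", "address"]
    }
  }
}
```

Swapping "best_fields" for "most_fields" adds the per-field scores together instead of keeping only the best one.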
In the online guide, this chapter in particular appears relevant:
http://www.elastic.co/guide/en/elasticsearch/guide/master/multi-field-search.html

Hope that helps

On Mon, Apr 13, 2015 at 7:30 PM, Andre Dantas Rocha wrote:

> Hi there,
>
> I have the following query:
>
> "query": {
>   "multi_match": {
>     "operator": "and",
>     "type": "cross_fields",
>     "query": "john smith",
>     "fields": ["name", "address"]
>   }
> }
>
> That will match these documents:
>
> Name: James *Smith*
> Address: 325 *John* Street
>
> Name: *John Smith* Junior
> Address: 100 Baryl Street
>
> Is there a way to give the last document a higher score since the terms
> "john" "smith" have two matches on the same field?
>
> Notice that behavior is a little bit different from the one using
> match_phrase with slop because the query can still match terms in any of
> the fields but score higher when there are more matches on the same field.
>
> Thanks,
>
> Andre

--
Doug Turnbull | Search Relevance Consultant | OpenSource Connections, LLC | 240.476.9983 | http://www.opensourceconnections.com
Author: Taming Search <http://manning.com/turnbull> from Manning Publications
