Sorry for the confusing typo: in "towards matches with fewer fields", "fields" should be "search terms".
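
To make the best_fields suggestion in the quoted reply below concrete, here is a minimal sketch of the original query with only the type changed. It assumes the same "name" and "address" fields and the same query text as in the question; the tie_breaker value of 0.3 is purely an illustrative assumption, not something from the thread. The operator is left at its default ("or") on purpose: with best_fields, "operator": "and" would require both terms to appear in the same field, so the first example document would stop matching.

```json
{
  "query": {
    "multi_match": {
      "query": "john smith",
      "type": "best_fields",
      "fields": ["name", "address"],
      "tie_breaker": 0.3
    }
  }
}
```

In this form each field is searched separately, so a document matching both terms in "name" should outscore one matching a single term in each field. Swapping the type to "most_fields" adds the per-field scores instead of taking the best one.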

On Mon, Apr 13, 2015 at 9:30 PM, Doug Turnbull < [email protected]> wrote:

> First, note that in Lucene's default similarity there are already two
> biases towards matches with fewer fields. Try to take advantage of those
> before going on a boosting expedition.
>
> 1. Each term tends to get converted into a boolean SHOULD clause. The
> score of every matching SHOULD clause gets added to the total. So the
> fewer clauses match, the lower the score.
>
> 2. For an even stronger bias, Lucene applies **coord**, the coordination
> factor. If only 1 out of 3 search terms matches the field being searched,
> the score is multiplied by 1/3, thus punishing it. So documents where more
> terms match have a much higher chance of winning.
>
> If you want to know more, read Lucene's javadocs on similarity:
> https://lucene.apache.org/core/5_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>
> Huh, you're thinking, why doesn't my scenario just work then? What you're
> doing is *cross_fields* search. Cross-field search is something new to
> Elasticsearch whereby both fields are blended together and treated like a
> single field. So the biasing above applies to the two fields together. If
> you want to know more about cross-field search, here's an article I
> recently wrote:
> http://opensourceconnections.com/blog/2015/03/19/elasticsearch-cross-field-search-is-a-lie/
>
> If you actually want a bias towards a field with more matches, I'd
> recommend best_fields or most_fields search. They take both search
> terms to each field first, performing a separate search in each field.
> The per-field scores are then combined (most_fields adds them,
> best_fields takes the max score).
>
> Until I finish the related chapter in the search relevance book I'm
> writing <shameless plug :-p http://manning.com/turnbull>, the best place
> to read about these topics is the docs or the online guide. In particular,
> this appears relevant:
>
> http://www.elastic.co/guide/en/elasticsearch/guide/master/multi-field-search.html
>
> Hope that helps
>
> On Mon, Apr 13, 2015 at 7:30 PM, Andre Dantas Rocha <
> [email protected]> wrote:
>
>> Hi there,
>>
>> I have the following query:
>>
>> "query": {
>>   "multi_match": {
>>     "operator": "and",
>>     "type": "cross_fields",
>>     "query": "john smith",
>>     "fields": ["name", "address"]
>>   }
>> }
>>
>> That will match these documents:
>>
>> Name: James *Smith*
>> Address: 325 *John* Street
>>
>> Name: *John Smith* Junior
>> Address: 100 Baryl Street
>>
>> Is there a way to give the last document a higher score, since the terms
>> "john" and "smith" both match in the same field?
>>
>> Notice that this behavior is a little different from using match_phrase
>> with slop: the query can still match terms in any of the fields, but it
>> should score higher when there are more matches in the same field.
>>
>> Thanks,
>>
>> Andre

--
Doug Turnbull | Search Relevance Consultant | OpenSource Connections, LLC | 240.476.9983 | http://www.opensourceconnections.com
Author: Taming Search <http://manning.com/turnbull> from Manning Publications
