Sorry for the confusing typo: in "towards matches with fewer *fields*",
"fields" should be search *terms*.

On Mon, Apr 13, 2015 at 9:30 PM, Doug Turnbull <
[email protected]> wrote:

> First, note that in Lucene's default similarity there already are two
> biases towards matches with fewer fields. Try to take advantage of those
> before going on a boosting expedition.
>
> 1. Each term tends to get converted into a boolean SHOULD clause. Every
> SHOULD clause match gets added to the score. So the fewer matches, the
> lower the score.
>
> 2. For an even stronger bias, Lucene adds **coord**, the coordination
> factor. If only 1 out of 3 search terms matches the field being searched, a
> multiplier of 1/3 is applied, penalizing the score. So matches where more
> terms match should have a much higher chance of winning.
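
To make points 1 and 2 concrete, here is a toy sketch in Python. It is not
Lucene's actual scoring code: it ignores tf, idf, field norms and the query
norm, and the names coord and toy_score are made up for illustration. It only
shows how summing the matched SHOULD clauses and then multiplying by
matched/total terms rewards documents that match more of the query.

```python
from typing import Sequence


def coord(matched_terms: int, total_terms: int) -> float:
    """Fraction of query terms that matched (hypothetical helper)."""
    if total_terms <= 0:
        raise ValueError("total_terms must be positive")
    return matched_terms / total_terms


def toy_score(matched_clause_scores: Sequence[float], total_terms: int) -> float:
    """Sum the scores of the SHOULD clauses that matched, then scale by coord.

    Simplifying assumption: each matched clause already carries its own score.
    """
    matched = len(matched_clause_scores)
    return sum(matched_clause_scores) * coord(matched, total_terms)


if __name__ == "__main__":
    # Three-term query, each matching clause worth 1.0 in this toy model.
    print(toy_score([1.0], 3))            # only 1 of 3 terms matched
    print(toy_score([1.0, 1.0, 1.0], 3))  # all 3 terms matched
```

In this toy model a document matching one of three terms ends up with a third
of a single clause's score, while a document matching all three keeps the full
sum, so the gap is wider than the sum alone would give.
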
>
> If you want to know more, read Lucene's javadocs on similarity:
> https://lucene.apache.org/core/5_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>
> Huh, you're thinking, why doesn't my scenario just work then? What you're
> doing is *cross_fields* search. Cross-field search is something new to
> Elasticsearch whereby both fields are blended together and treated like a
> single field. So the biasing above applies to the two fields together. If
> you want to know more about cross-field search, here's an article I
> recently wrote:
> http://opensourceconnections.com/blog/2015/03/19/elasticsearch-cross-field-search-is-a-lie/
>
> If you actually want a bias towards a field with more matches, I'd
> recommend best_fields or most_fields search. They run both search terms
> against each field separately, performing a different search in each field.
> The per-field scores are then combined (most_fields adds them, best_fields
> takes the max score).
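
As a rough sketch of what that could look like for the query in this thread,
the Python below builds the request body as a plain dict. The helper name
build_multi_match is made up for illustration. One assumption to flag: I left
out "operator": "and", because with best_fields and most_fields the operator
is applied to each field separately, so keeping it would drop documents whose
terms are split across name and address.

```python
import json
from typing import Dict, Sequence

# The two per-field multi_match types recommended above.
ALLOWED_TYPES = ("best_fields", "most_fields")


def build_multi_match(query: str, fields: Sequence[str], match_type: str) -> Dict:
    """Build a multi_match request body (hypothetical helper, not a client API)."""
    if match_type not in ALLOWED_TYPES:
        raise ValueError("match_type must be one of %s" % (ALLOWED_TYPES,))
    return {
        "query": {
            "multi_match": {
                "type": match_type,
                "query": query,
                "fields": list(fields),
            }
        }
    }


if __name__ == "__main__":
    # Print both variants so they can be pasted into a search request.
    for match_type in ALLOWED_TYPES:
        body = build_multi_match("john smith", ["name", "address"], match_type)
        print(json.dumps(body, indent=2))
```

Whether best_fields or most_fields ranks the "John Smith Junior" document
first depends on your data and analyzers, so it is worth checking the explain
output for both.
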
>
> Until I finish the related chapter in the search relevance book I'm
> writing <shameless plug :-p http://manning.com/turnbull>, the best place
> to read about these topics is the docs or the online guide. In particular,
> this appears relevant:
>
> http://www.elastic.co/guide/en/elasticsearch/guide/master/multi-field-search.html
>
> Hope that helps
>
>
> On Mon, Apr 13, 2015 at 7:30 PM, Andre Dantas Rocha <
> [email protected]> wrote:
>
>> Hi there,
>>
>> I have the following query:
>>
>> "query": {
>>   "multi_match": {
>>     "operator": "and",
>>     "type": "cross_fields",
>>     "query": "john smith",
>>     "fields": ["name", "address"]
>>   }
>> }
>>
>> That will match these documents:
>>
>> Name: James *Smith*
>> Address: 325 *John* Street
>>
>> Name: *John Smith* Junior
>> Address: 100 Baryl Street
>>
>> Is there a way to give the last document a higher score since the terms
>> "john" "smith" have two matches on the same field?
>>
>> Notice that this behavior is a little different from using
>> match_phrase with slop, because the query can still match terms in any of
>> the fields but should score higher when there are more matches on the same field.
>>
>> Thanks,
>>
>> Andre
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/cc76f51b-3721-4978-a3ed-e59ff4c8f138%40googlegroups.com
>> <https://groups.google.com/d/msgid/elasticsearch/cc76f51b-3721-4978-a3ed-e59ff4c8f138%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
> --
> *Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections,
> LLC | 240.476.9983 | http://www.opensourceconnections.com
> Author: Taming Search <http://manning.com/turnbull> from Manning
> Publications
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>
>


