For the third rule, you can omit norms for a field, which prevents length
normalization from being applied. See [1]. Depending on your version, the
option is called either omit_norms or norms.enabled.
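
As a rough sketch of what that mapping could look like (the type name
"resume" and the field name "skills" are made up for illustration, so
substitute your own):

```json
{
  "mappings": {
    "resume": {
      "properties": {
        "skills": {
          "type": "string",
          "norms": { "enabled": false }
        }
      }
    }
  }
}
```

On versions that use the older option name, the field would carry
"omit_norms": true in place of the "norms" object. Check [1] for the
spelling your version expects.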

The second rule is slightly more complicated. You can define your own
custom similarity [2] that dictates how the TF, IDF and norms are used.
You simply extend Lucene's DefaultSimilarity (or TFIDFSimilarity) and add
it to Elasticsearch's classpath.
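
To show why neutralizing IDF and the length norm changes the ranking, here
is a small standalone Java sketch. It does not use Lucene at all. It only
mimics the classic factors as I understand them (tf = sqrt(freq),
idf = 1 + ln(numDocs / (docFreq + 1)), lengthNorm = 1 / sqrt(numTerms)) for
a single-term query, and ignores query normalization, coord and the lossy
norm encoding. The class name, method names and sample numbers are all
invented for illustration.

```java
public class ScoreSketch {

    // Term-frequency factor: grows with the number of occurrences.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms get a larger factor.
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Length normalization: shorter fields get a larger factor.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Simplified classic score for a single-term query: tf * idf^2 * norm.
    static double classicScore(int freq, long docFreq, long numDocs, int numTerms) {
        double idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf * lengthNorm(numTerms);
    }

    // Score when IDF and the length norm are both forced to 1,
    // so that only the number of occurrences matters.
    static double occurrencesOnlyScore(int freq) {
        return tf(freq);
    }

    public static void main(String[] args) {
        long numDocs = 1000;   // hypothetical corpus size
        long docFreq = 50;     // hypothetical number of docs containing the term

        // Short resume: 2 occurrences in 100 terms.
        // Long resume: 9 occurrences in 10000 terms.
        double shortClassic = classicScore(2, docFreq, numDocs, 100);
        double longClassic = classicScore(9, docFreq, numDocs, 10000);

        System.out.println("classic: short resume ranks first = "
                + (shortClassic > longClassic));
        System.out.println("occurrences only: long resume ranks first = "
                + (occurrencesOnlyScore(9) > occurrencesOnlyScore(2)));
    }
}
```

In a real similarity subclass the equivalent would be overriding the idf
and lengthNorm methods to return a constant 1. The exact method signatures
differ between Lucene versions, so check the Javadoc for the version your
Elasticsearch release ships with.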

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-similarity.html

-- 
Ivan


On Sun, Jan 26, 2014 at 11:12 PM, Hiro Gangwani <[email protected]> wrote:

> Dear Team,
>
> I have been looking at the search algorithm used in Elasticsearch and
> found the following set of rules, which are applied while calculating the
> score (Boolean Model):
>
>
>    - more occurrences in the document are preferred
>    - terms rarer in the corpus are preferred
>    - shorter documents are more heavily weighted
>    - other functions used to adjust score, boosts, etc.
>
> In my application we are doing text-based search across a set of word
> documents. We would like to assign a higher score to documents having
> more occurrences and show them at the top irrespective of document size.
> Our application is primarily a recruitment system where search is based
> on skill sets, so our business team wants to show the resumes having more
> occurrences of the search keywords at the top, irrespective of size and
> rare terms. Is there any mechanism to ignore the second and third rules
> listed above and calculate the score based on the more-occurrences
> condition only? We are executing search operations using the Java API.
> Please let me know whether this is possible and, if so, how.
>
> Thanks in advance for suggesting a solution.
>
> Hiro
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/f6936b6f-ef7c-4497-b186-bdba28176d89%40googlegroups.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>

