[ 
https://issues.apache.org/jira/browse/LUCENE-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838459#action_12838459
 ] 

Robert Muir commented on LUCENE-2286:
-------------------------------------

OK, I will commit in a few days if no one objects. In my opinion the
backwards-compatibility break is the easiest way to go.

In practice it won't hurt existing docs, and if someone is concerned about bad 
ranking (because the newly indexed docs suddenly are ranked better), they can 
turn this off with the boolean until they get a chance to reindex all docs.

> enable DefaultSimilarity.setDiscountOverlaps by default
> -------------------------------------------------------
>
>                 Key: LUCENE-2286
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2286
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Query/Scoring
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 3.1
>
>         Attachments: LUCENE-2286.patch
>
>
> I think we should enable setDiscountOverlaps in DefaultSimilarity by default.
> If you are using synonyms or commongrams or a number of other 
> 0-posInc-term-injecting methods, these currently screw up your length 
> normalization.
> These terms have a position increment of zero, so they shouldn't count towards 
> the length of the document.
> I've done relevance tests with Persian showing the difference is significant, 
> and I think it's a big trap for anyone using synonyms, etc.: your relevance can 
> actually get worse if you don't flip this boolean flag.
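
To make the effect on length normalization concrete, here is a minimal,
self-contained sketch. It is not the Lucene code: the Token class and the
lengthNorm helper are hypothetical stand-ins, and the 1/sqrt(length) formula
is assumed from DefaultSimilarity's length norm, leaving out field boost and
the lossy single-byte norm encoding.

```java
import java.util.Arrays;
import java.util.List;

final class DiscountOverlapsSketch {

    /** Hypothetical stand-in for an analyzed token: term text plus position increment. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /**
     * Assumed length norm: 1 / sqrt(number of counted tokens).
     * With discountOverlaps set, tokens whose position increment is 0
     * (injected synonyms, commongrams, ...) are not counted.
     */
    static float lengthNorm(List<Token> tokens, boolean discountOverlaps) {
        int length = 0;
        for (Token t : tokens) {
            if (discountOverlaps && t.positionIncrement == 0) {
                continue; // overlap token: same position as the previous one
            }
            length++;
        }
        // Guard chosen for this sketch only, to avoid dividing by zero.
        return length == 0 ? 0f : (float) (1.0 / Math.sqrt(length));
    }

    public static void main(String[] args) {
        // "quick brown fox" with the synonym "fast" injected at the position of "quick".
        List<Token> doc = Arrays.asList(
                new Token("quick", 1),
                new Token("fast", 0),
                new Token("brown", 1),
                new Token("fox", 1));

        System.out.println("overlaps counted:    " + lengthNorm(doc, false));
        System.out.println("overlaps discounted: " + lengthNorm(doc, true));
    }
}
```

With overlaps counted, the three-word document is normalized as if it had
four terms, so every injected synonym lowers its norm relative to an
otherwise identical document with no synonyms.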


