[
https://issues.apache.org/jira/browse/LUCENE-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967409#comment-15967409
]
Alessandro Benedetti commented on LUCENE-7776:
----------------------------------------------
Good one Tommaso!
I have recently been working on this:
https://issues.apache.org/jira/browse/LUCENE-7498
The modification itself is not big, but part of the task has been a
substantial refactoring of, and the introduction of tests for, the More Like
This component (which is heavily used by the KNN classifiers).
I understand the patch will be quite big (and probably boring to review), but
if we finalize it, it will make the More Like This component easy to extend
and improve.
In the next few days I will update the Jira issue with a pull request and
details of what it contains and its benefits; feel free to review it (
> Switch KNN classifier to use BM25 similarity
> --------------------------------------------
>
> Key: LUCENE-7776
> URL: https://issues.apache.org/jira/browse/LUCENE-7776
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/classification
> Reporter: Tommaso Teofili
> Assignee: Tommaso Teofili
> Fix For: master (7.0)
>
>
> It'd be good to use BM25 as the default {{Similarity}} for the KNN classifier.
> Some tests on the _20newsgroups_ dataset resulted in improved _f1_
> (between 0.10 and 0.15).
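
For readers unfamiliar with BM25, here is a minimal sketch of the textbook
per-term score that a BM25-based similarity computes when ranking the nearest
neighbours. It is not taken from the patch or from Lucene's source: the class
name Bm25Sketch, its method names and the sample statistics are hypothetical,
and k1 = 1.2, b = 0.75 are assumed as the usual defaults.

```java
public final class Bm25Sketch {
    // Assumed tuning constants (the commonly used defaults).
    static final double K1 = 1.2;   // controls term-frequency saturation
    static final double B = 0.75;   // controls document-length normalization

    /** Smoothed inverse document frequency; stays positive for any docFreq <= docCount. */
    static double idf(long docFreq, long docCount) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** BM25 contribution of a single query term to a single document. */
    static double score(double tf, long docFreq, long docCount,
                        double docLen, double avgDocLen) {
        // Longer-than-average documents get a larger normalization factor,
        // which lowers their score for the same term frequency.
        double norm = K1 * (1.0 - B + B * docLen / avgDocLen);
        return idf(docFreq, docCount) * tf * (K1 + 1.0) / (tf + norm);
    }

    public static void main(String[] args) {
        // Hypothetical statistics: a term found in 10 of 1000 documents,
        // scored against documents of average length (100 terms).
        for (int tf = 1; tf <= 8; tf *= 2) {
            System.out.printf("tf=%d score=%.4f%n", tf, score(tf, 10, 1000, 100, 100));
        }
    }
}
```

Unlike a classic TF-IDF weight, the score grows more and more slowly as tf
increases and is bounded by idf * (k1 + 1), so a single heavily repeated term
cannot dominate the neighbour ranking.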
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)