Ahmet Arslan created LUCENE-6818:
------------------------------------

             Summary: Implementing Divergence from Independence (DFI) Term-Weighting for Lucene/Solr
                 Key: LUCENE-6818
                 URL: https://issues.apache.org/jira/browse/LUCENE-6818
             Project: Lucene - Core
          Issue Type: New Feature
          Components: core/query/scoring
    Affects Versions: 5.3
            Reporter: Ahmet Arslan
            Priority: Minor
             Fix For: Trunk

As explained in the [write-up|http://lucidworks.com/blog/flexible-ranking-in-lucene-4], implementations of many state-of-the-art ranking models have been added to Apache Lucene.

This issue aims to add the DFI model, which is the non-parametric counterpart of the Divergence from Randomness (DFR) framework.

DFI is both parameter-free and non-parametric:
* parameter-free: it does not require any parameter tuning or training.
* non-parametric: it does not make any assumptions about word frequency distributions in document collections.

It is highly recommended *not* to remove stopwords (very common terms: the, of, and, to, a, in, for, is, on, that, etc.) with this similarity.

For more information see: [A nonparametric term weighting method for information retrieval based on measuring the divergence from independence|http://dx.doi.org/10.1007/s10791-013-9225-4]
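
To make the "parameter-free" point concrete, here is a minimal sketch of how a DFI-style term weight can be computed from collection statistics alone. It assumes the standardized independence measure (observed minus expected frequency, divided by the square root of the expected frequency) and takes the expected frequency under independence to be the term's collection frequency scaled by the document's share of all tokens. The class name {{DfiSketch}}, the method {{score}}, and its parameter names are hypothetical and chosen for illustration only; this is not necessarily the form the patch for this issue uses, and it omits any smoothing or boosting.

```java
public class DfiSketch {

    /** Base-2 logarithm. */
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /**
     * Illustrative DFI-style weight for one term in one document.
     * Assumes totalTermFreq > 0 and numberOfFieldTokens > 0.
     *
     * @param freq                observed frequency of the term in the document
     * @param docLen              length of the document in tokens
     * @param totalTermFreq       occurrences of the term in the whole collection
     * @param numberOfFieldTokens total number of tokens in the collection
     */
    static double score(double freq, double docLen,
                        double totalTermFreq, double numberOfFieldTokens) {
        // Frequency expected if term and document were independent.
        double expected = totalTermFreq * docLen / numberOfFieldTokens;

        // A term seen no more often than expected contributes nothing.
        if (freq <= expected) {
            return 0.0;
        }

        // Standardized divergence from independence (assumed measure).
        double measure = (freq - expected) / Math.sqrt(expected);
        return log2(measure + 1.0);
    }

    public static void main(String[] args) {
        // Rare term, clearly over-represented in the document.
        System.out.println(score(5, 100, 1000, 1_000_000));
        // Term occurring far less often than expected: weight is zero.
        System.out.println(score(1, 100, 1000, 1000));
    }
}
```

Note that nothing in the sketch needs tuning: every input is either a per-document count or a collection-level statistic. In this sketch a very common term that occurs in a document at or below its expected rate receives a weight of zero, so such terms are discounted by the weighting itself.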