FieldMaskingSpanQuery and statistics

2015-04-15 Thread Stephen Wu
In the documentation for FieldMaskingSpanQuery, it says: "Note: as getField() returns the masked field, scoring will be done using the Similarity and collection statistics of the field name supplied, but with the term statistics of the real field. This may lead to exceptions, poor performance,

Re: Stats in CustomScoreProvider + (in)correctness of LMDirichletSimilarity

2015-05-02 Thread Stephen Wu
ionProbabilities into CustomScoreProvider, would be appreciated.

stephen

On Fri, May 1, 2015 at 11:16 AM, Stephen Wu wrote:
> I am having trouble getting collection probabilities for a term to show up
> in a CustomScoreQuery/CustomScoreProvider. Basically, I am trying to add a
> per-doc

Stats in CustomScoreProvider + (in)correctness of LMDirichletSimilarity

2015-05-02 Thread Stephen Wu
I am having trouble getting collection probabilities for a term to show up in a CustomScoreQuery/CustomScoreProvider. Basically, I am trying to add a per-document weight that amounts to the sum (for each term in the query) of Math.log(collectionProbability). Can anyone help with this? Or feel fr
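
For reference, the per-document weight described in this message (a sum over query terms of Math.log(collectionProbability)) can be sketched in plain Java without any Lucene dependency. This is only an illustration under stated assumptions, not the poster's code: the class and method names are hypothetical, and the add-one smoothed estimate (totalTermFreq + 1) / (sumTotalTermFreq + 1) is an assumed collection model. In a real index the two counts would presumably come from something like IndexReader.totalTermFreq(Term) and IndexReader.getSumTotalTermFreq(String).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of Lucene and not taken from the thread.
public class CollectionLogProb {

    // Assumed collection model: add-one smoothed relative frequency of the
    // term among all tokens of the field.
    static double collectionProbability(long totalTermFreq, long sumTotalTermFreq) {
        return (totalTermFreq + 1.0) / (sumTotalTermFreq + 1.0);
    }

    // Sum of log collection probabilities over the query terms.
    // termFreqs maps each query term to its total frequency in the collection.
    static double sumLogCollectionProbability(Map<String, Long> termFreqs,
                                              long sumTotalTermFreq) {
        double sum = 0.0;
        for (long totalTermFreq : termFreqs.values()) {
            sum += Math.log(collectionProbability(totalTermFreq, sumTotalTermFreq));
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up counts, for illustration only.
        Map<String, Long> termFreqs = new LinkedHashMap<>();
        termFreqs.put("lucene", 9L);
        termFreqs.put("unseen", 0L);
        long sumTotalTermFreq = 99L;
        System.out.println(sumLogCollectionProbability(termFreqs, sumTotalTermFreq));
    }
}
```

Because the sum depends only on the query terms and collection-level counts, it is the same for every document in this sketch; it could be computed once per query rather than once per document.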