[ 
https://issues.apache.org/jira/browse/LUCENE-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-3395:
-----------------------------

    Attachment: LUCENE-3395.patch

patch containing FreqFilteringScorerWrapper and a test.  I haven't yet done the 
work on TermQuery to add options for this -- wanted to see what people thought 
of it first and get some code review ... been a while since i touched code this 
deep in the stack.

a few things to note:

* entire class is marked experimental since its whole existence depends on an
experimental method of the Scorer API.  that said: even if we rip out
Scorer.freq, i think we can still support this as a TermQuery feature since
freq info will always be available from TermScorer.
* test currently has some nocommits related to an NPE when trying to check the
edge case of wrapping a Scorer that matches nothing.  i think the problem
relates to some code i cut/pasted from TestTermScorer for getting a Scorer from
a Query+Searcher to use in the test, which seems to optimize the Scorer to
null when it matches nothing.  (even if i didn't have this NPE, that getScorer
method would stay marked nocommit until someone verified it was in fact a
"valid" way for a test to get direct access to a Scorer)
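
For readers without the patch handy, here is a rough, self-contained sketch of the general idea: a wrapper that only surfaces documents whose freq falls inside a [min, max] range, plus a null guard for the "matches nothing" edge case mentioned above. This is not the code in LUCENE-3395.patch and it does not use the real Lucene Scorer API. DocFreqIterator, FreqFilteringIterator, ArrayDocFreqIterator and FreqFilterDemo are hypothetical names invented for illustration, and treating both bounds as inclusive is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, stripped-down stand-in for a Scorer; NOT the real Lucene API. */
interface DocFreqIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Advances to the next matching doc and returns its id, or NO_MORE_DOCS. */
    int nextDoc();

    /** Frequency of the match in the current doc. */
    float freq();
}

/** Wraps another iterator and skips docs whose freq is outside [minFreq, maxFreq]. */
final class FreqFilteringIterator implements DocFreqIterator {
    private final DocFreqIterator in;
    private final float minFreq;
    private final float maxFreq;

    private FreqFilteringIterator(DocFreqIterator in, float minFreq, float maxFreq) {
        this.in = in;
        this.minFreq = minFreq;
        this.maxFreq = maxFreq;
    }

    /** Assumption: a null input means "matches nothing", so nothing is wrapped. */
    static DocFreqIterator wrap(DocFreqIterator in, float minFreq, float maxFreq) {
        return in == null ? null : new FreqFilteringIterator(in, minFreq, maxFreq);
    }

    @Override
    public int nextDoc() {
        int doc;
        // Keep pulling docs from the wrapped iterator until one is in range.
        while ((doc = in.nextDoc()) != NO_MORE_DOCS) {
            float f = in.freq();
            if (f >= minFreq && f <= maxFreq) {
                return doc;
            }
        }
        return NO_MORE_DOCS;
    }

    @Override
    public float freq() {
        return in.freq();
    }
}

/** Array-backed iterator, used only to exercise the sketch. */
final class ArrayDocFreqIterator implements DocFreqIterator {
    private final int[] docs;
    private final float[] freqs;
    private int pos = -1;

    ArrayDocFreqIterator(int[] docs, float[] freqs) {
        this.docs = docs;
        this.freqs = freqs;
    }

    @Override
    public int nextDoc() {
        if (pos + 1 >= docs.length) {
            pos = docs.length;
            return NO_MORE_DOCS;
        }
        return docs[++pos];
    }

    @Override
    public float freq() {
        return freqs[pos];
    }
}

final class FreqFilterDemo {
    /** Collects all doc ids, treating a null iterator as an empty result. */
    static List<Integer> collect(DocFreqIterator it) {
        List<Integer> out = new ArrayList<>();
        if (it == null) {
            return out;
        }
        for (int doc = it.nextDoc(); doc != DocFreqIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
            out.add(doc);
        }
        return out;
    }
}
```

With docs {0, 1, 2, 3} carrying freqs {1, 3, 5, 2} and a range of [2, 4], collecting from the wrapped iterator yields docs 1 and 3. Passing null through wrap and collect yields an empty list instead of an NPE.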

> FreqFilteringScorerWrapper and min/max freq options on TermQuery
> ----------------------------------------------------------------
>
>                 Key: LUCENE-3395
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3395
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Hoss Man
>         Attachments: LUCENE-3395.patch
>
>
> A Solr user was asking about how to specify a minimum tf when searching for a
> term (i.e. documents matching "dog" at least 3 times).
> Based on a conversation with rmuir on IRC, that led me to realize that we now 
> explicitly expose a general "freq()" method on Scorer, and that min/max freq 
> constraints could be implemented as a general Scorer Wrapper.
> I propose that we add such a wrapper, and add
> setMinFreq(float)/setMaxFreq(float) methods to TermQuery (similar to the
> minimumNumberShouldMatch and disableCoord style options in BooleanQuery) that
> cause it to be used automatically.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
