Hi Hoss,

Good to hear that; I felt a bit fuzzy trying to grasp all the possibilities.
I've read the discussion of Doug's proposal for implementing non-scoring Query features, of ConstantScoreQuery, and of Paul's FilteredQuery patch. In summary, the options to avoid scoring are:

1. There is a consensus that Doug's proposal would be the best way to proceed, but it will take some time until we get there.
2. Filters are perfect for what they do well: filtering. But using them to reimplement BooleanQuery, mirroring everything with filters, would introduce a lot of redundancy. BooleanQuery, for example, does a lot of cool optimizations (short-circuiting expressions and so on), and more or less the same things would have to be reimplemented using filters.
3. ConstantScoreQuery: I am a bit unsure here, but this looks like a bridge that lets Filters enter the "regular" Query world.
4. Paul's patch on FilteredQuery: this is "just" an optimization to avoid unnecessary scoring for docs that do not pass the Filter (a rather smart one, though).

So, back to my use case:

You are totally right, ZIP codes are best done with a SetFilter (or PrefixFilter), no doubt about that. And they were the actual problem, so the solution is already here.

But now that I have learned about ConstantScoreQuery, I started thinking about the following option.

The problem: the first part of the query (the part about the field "name" in my example) combines term queries in many strange ways using BooleanQuery, so using ChainedFilter would make things hard to read, generalize and get right.

So, what would you say about the following:

1. Make a TermFilter for every unique, high-frequency term in my query (I have frequency info while constructing the query). Of course, simple caching at the TermFilter level is really simple to add.
2. Wrap those TermFilters in ConstantScoreQuery.
3. Combine them inside BooleanQuery as before (a Boolean mix of term queries and ConstantScoreQueries).

The ZIPS field goes into a SetFilter.

Did I already say "thank you!"
for staying with me while asking dumb questions :)

And yes, if you get close to Hanover, a good German beer on me is a sure thing.

--- Chris Hostetter <[EMAIL PROTECTED]> wrote:

> : Wouldn't it make sense to have a BooleanFilter,
> : TermFilter, MultiTermFilter, RangeFilter... family to
> : "mirror" the xxxQuery world with the same idioms and
> : interfaces? Is this the direction already taken in
> : Lucene development (an alternative would be to
> : parametrize the existing Query world)? As I see it
> : functionally, at the moment filters (and their
> : combination) are the only way to use a fast "pure
> : boolean" model.
> :
> : Does what I just said make any sense?
>
> It makes perfect sense, and you have grasped a lot of the possibilities.
> While making a Filter variant of every Query class is my gut
> instinct, there has in fact been discussion about generalizing Queries so
> that they can have a "non-scoring" mode. These issues have all been
> mentioned in LUCENE-383 ...
>
> http://issues.apache.org/jira/browse/LUCENE-383
>
> One of the big reasons why it might make sense to use Queries instead of
> Filters even if you don't care about scoring is when you have a large set
> of very restrictive conditions (i.e. a BooleanQuery consisting of many
> TermQueries). The BooleanScorer can make good decisions to skip over
> large sets of documents -- sometimes ignoring sub queries entirely -- when
> one sub query only matches a few documents, because of the flexibility of
> the Scorer API.
>
> The Filter API on the other hand doesn't have this flexibility. There is
> no way for a ChainedFilter/BooleanFilter to know that it can skip over
> one of its sub filters, or ask one of its sub filters to only look at
> certain documents.
>
> I suggested the full Filter approach for your situation based on the
> following information...
> 1) you didn't care about scoring
> 2) you were using Range/Prefix queries on the ZIP field that could
>    easily exceed practical clause limits in BooleanQuery.
> 3) your restrictions on the ZIP field looked like they could be cached
>    individually so the results could be reused across many searches
>
> -Hoss
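
A rough sketch of the three-step idea proposed above (cached TermFilters for the "name" terms, constant scoring, a Boolean combination, and a SetFilter for the ZIPs), written as a toy model in plain Java with no Lucene dependency. Everything in it is an assumption made for illustration: the class PureBooleanSketch, its methods, the sample documents and the example query name:(john AND (smith OR miller)) are all hypothetical, and none of this is Lucene's API. Because a constant score carries no ranking information, wrapping and combining reduce to set algebra on BitSets here.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: these are NOT Lucene classes, and every name here is hypothetical.
class PureBooleanSketch {
    private final String[][] names; // names[docId] = tokens of the "name" field
    private final String[] zips;    // zips[docId]  = value of the ZIP field
    private final Map<String, BitSet> termCache = new HashMap<>();
    int cacheMisses = 0;            // counts how often a term had to be scanned

    PureBooleanSketch(String[][] names, String[] zips) {
        this.names = names;
        this.zips = zips;
    }

    // Step 1: a cached "TermFilter": the doc ids whose name field contains the term.
    BitSet termFilter(String term) {
        BitSet bits = termCache.get(term);
        if (bits == null) {
            cacheMisses++;
            bits = new BitSet(names.length);
            for (int doc = 0; doc < names.length; doc++) {
                if (Arrays.asList(names[doc]).contains(term)) {
                    bits.set(doc);
                }
            }
            termCache.put(term, bits);
        }
        return (BitSet) bits.clone(); // hand out a copy so the cached set stays intact
    }

    // The ZIP restriction as a "SetFilter": doc ids whose ZIP is in the allowed set.
    BitSet zipSetFilter(Set<String> allowed) {
        BitSet bits = new BitSet(zips.length);
        for (int doc = 0; doc < zips.length; doc++) {
            if (allowed.contains(zips[doc])) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // Steps 2 and 3 collapse to set algebra in this model:
    // name:(john AND (smith OR miller)), restricted to the allowed ZIPs.
    BitSet search(Set<String> allowedZips) {
        BitSet shouldPart = termFilter("smith");
        shouldPart.or(termFilter("miller"));
        BitSet result = termFilter("john");
        result.and(shouldPart);
        result.and(zipSetFilter(allowedZips));
        return result;
    }

    // Four made-up documents, ids 0..3.
    static PureBooleanSketch sample() {
        String[][] names = {
            {"john", "smith"}, {"john", "miller"}, {"anna", "smith"}, {"john", "smith"}
        };
        String[] zips = {"30159", "30161", "80331", "80331"};
        return new PureBooleanSketch(names, zips);
    }

    public static void main(String[] args) {
        PureBooleanSketch index = sample();
        Set<String> allowed = new HashSet<>(Arrays.asList("30159", "30161"));
        System.out.println(index.search(allowed)); // prints {0, 1}
        index.search(allowed);                     // second run is served from the term cache
        System.out.println(index.cacheMisses);     // prints 3
    }
}
```

The sketch always evaluates every clause in full, which is exactly the limitation Hoss describes for filters: nothing here can skip a sub filter or restrict it to certain documents. Keeping the combination inside BooleanQuery, as in step 3, is what would leave that skipping to the Scorer.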