Re: Term Boost Threshold

2009-11-13 Thread Jake Mannix
On Fri, Nov 13, 2009 at 4:21 PM, Max Lynch wrote:
> > Well already, without doing any boosting, documents matching more of the terms
> > in your query will score higher. If you really want to make this effect more
> > pronounced, yes, you can boost the more important query terms higher.
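
A minimal sketch of the boosting idea quoted above, written as a toy Python model rather than Lucene code. It is not Lucene's real scoring formula: each phrase that appears in a document simply adds its boost to the score. The function name `boosted_score` and the boost values are assumptions made for illustration. In Lucene's classic query syntax, a boost is written with a caret after the clause, for example `"John Smith Manufacturing"^3`.

```python
def boosted_score(text, boosts):
    """Toy score: sum the boost of every phrase found in the text.

    `boosts` maps a phrase to its (hypothetical) boost factor.
    This is an illustration only, not how Lucene computes scores.
    """
    text = text.lower()
    return sum(boost for phrase, boost in boosts.items()
               if phrase.lower() in text)


# Hypothetical boosts: the company name matters more than the locations.
boosts = {
    "San Francisco": 1.0,
    "California": 1.0,
    "John Smith Manufacturing": 3.0,
}

if __name__ == "__main__":
    for doc in ("John Smith Manufacturing of California",
                "San Francisco, California"):
        print(doc, "->", boosted_score(doc, boosts))
```

With these made-up boosts, a document mentioning the company and one location outscores a document mentioning both locations, which is the "more pronounced" effect the message describes.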

Re: Term Boost Threshold

2009-11-13 Thread Max Lynch
> Well already, without doing any boosting, documents matching more of the terms
> in your query will score higher. If you really want to make this effect more
> pronounced, yes, you can boost the more important query terms higher.
>
> -jake

But there isn't a way to determine exactly what bo

Re: Term Boost Threshold

2009-11-13 Thread Jake Mannix
On Fri, Nov 13, 2009 at 4:02 PM, Max Lynch wrote:
> > > Now, I would like to know exactly what term was found. For example, if a
> > > result comes back from the query above, how do I know whether John Smith was
> > > found, or both John Smith and his company, or just John Smith Ma

Re: Term Boost Threshold

2009-11-13 Thread Max Lynch
> > Now, I would like to know exactly what term was found. For example, if a
> > result comes back from the query above, how do I know whether John Smith was
> > found, or both John Smith and his company, or just John Smith Manufacturing
> > was found?
>
> In general, this is actually very

Re: Term Boost Threshold

2009-11-13 Thread Jake Mannix
On Fri, Nov 13, 2009 at 3:35 PM, Max Lynch wrote:
> > query: "San Francisco" "California" +("John Smith" "John Smith Manufacturing")
> >
> > Here the San Fran and CA clauses are optional, and the ("John Smith" OR
> > "John Smith Manufacturing") is required.
>
> Thanks Jake, that works nic

Re: Term Boost Threshold

2009-11-13 Thread Max Lynch
> query: "San Francisco" "California" +("John Smith" "John Smith Manufacturing")
>
> Here the San Fran and CA clauses are optional, and the ("John Smith" OR
> "John Smith Manufacturing") is required.

Thanks Jake, that works nicely. Now, I would like to know exactly what term was found. For e

Re: Term Boost Threshold

2009-11-13 Thread Jake Mannix
Did I do that wrong? I always mess up the AND/OR human-readable form of this - it's clearer when you use +/- unary operators instead:

query: "San Francisco" "California" +("John Smith" "John Smith Manufacturing")

Here the San Fran and CA clauses are optional, and the ("John Smith" OR "John Smith
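
To make the required-versus-optional distinction concrete, here is a toy Python model of the query above. It is a sketch under stated assumptions, not Lucene code: the `+(...)` group is modeled as "at least one of these phrases must appear", the unmarked clauses only add to the score, and every match is worth one point. The function name `toy_score` and the scoring rule are hypothetical.

```python
def toy_score(text, optional, required_any):
    """Return None when the required group does not match, else a toy score.

    required_any: phrases of the +(...) group; at least one must be present.
    optional:     unmarked phrases; they never exclude a document, they
                  only raise its score when present.
    One point per matching phrase (an assumption, not Lucene's formula).
    """
    text = text.lower()
    required_hits = [p for p in required_any if p.lower() in text]
    if not required_hits:
        return None  # document is not a hit at all
    optional_hits = [p for p in optional if p.lower() in text]
    return len(required_hits) + len(optional_hits)


optional = ["San Francisco", "California"]
required_any = ["John Smith", "John Smith Manufacturing"]

if __name__ == "__main__":
    docs = [
        "John Smith lives in Texas",
        "John Smith Manufacturing, San Francisco, California",
        "San Francisco, California",
    ]
    for doc in docs:
        print(repr(doc), "->", toy_score(doc, optional, required_any))
```

In this model the third document is rejected even though both location phrases match, while the first is accepted with no location at all, mirroring the behavior the message describes for the `+` operator.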

Re: Term Boost Threshold

2009-11-13 Thread Max Lynch
> You want a query like
>
> ("San Francisco" OR "California") AND ("John Smith" OR "John Smith Manufacturing")

Won't this require San Francisco or California to be present? I do not require them to be, I only require "John Smith" OR "John Smith Manufacturing", but I want to get a bigger scor

Re: Term Boost Threshold

2009-11-13 Thread Jake Mannix
Hi Max,

You want a query like

("San Francisco" OR "California") AND ("John Smith" OR "John Smith Manufacturing")

essentially? You can give Lucene exactly this query and it will require that either "John Smith" or "John Smith Manufacturing" be present, but will score results which have these

Term Boost Threshold

2009-11-13 Thread Max Lynch
Hi, I am trying to move from a system where I counted the frequency of terms by hand in a highlighter to determine if a result was useful to me. In an earlier post on this list someone suggested I could boost the terms that are useful to me and only accept hits above a certain threshold. However,