You can use a CustomScoreQuery wrapping your scored query to multiply the "confidence level" (as a DocValues field in Lucene trunk, or an indexed NumericField with precisionStep=Integer.MAX_VALUE using FieldCache) into the score.
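
To make the arithmetic concrete, here is a minimal plain-Java sketch of what "multiply the confidence into the score" does to a ranking. It deliberately uses no Lucene classes: the class name `ConfidenceBoost`, the methods `customScore` and `rank`, and the text-match scores of 1.0 are all hypothetical and only for illustration. The IBM confidences (0.5 and 0.6) are taken from the example in the thread below, and the sketch assumes one confidence value per document for the company being searched.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the scoring arithmetic only; no Lucene classes are used.
class ConfidenceBoost {

    // Combine the text-match score with the stored confidence by multiplication.
    static float customScore(float subQueryScore, float confidence) {
        return subQueryScore * confidence;
    }

    // Rank document ids by boosted score, highest first.
    // textScores:  hypothetical text-match score per document id.
    // confidences: stored confidence for ONE company per document id
    //              (assumption: missing entries count as 0).
    static List<String> rank(Map<String, Float> textScores,
                             Map<String, Float> confidences) {
        List<String> ids = new ArrayList<>(textScores.keySet());
        ids.sort(Comparator.comparingDouble(
                (String id) -> -customScore(textScores.get(id),
                                            confidences.getOrDefault(id, 0f))));
        return ids;
    }

    public static void main(String[] args) {
        // Both documents match the text query equally well (made-up scores).
        Map<String, Float> textScores = new LinkedHashMap<>();
        textScores.put("document 1", 1.0f);
        textScores.put("document 2", 1.0f);

        // IBM confidences from the example in the thread.
        Map<String, Float> ibmConfidence = new LinkedHashMap<>();
        ibmConfidence.put("document 1", 0.5f);
        ibmConfidence.put("document 2", 0.6f);

        for (String id : rank(textScores, ibmConfidence)) {
            System.out.println(id + " -> "
                    + customScore(textScores.get(id), ibmConfidence.get(id)));
        }
    }
}
```

Note that this only reorders hits by confidence; it does not filter out documents below a threshold, so a hard cut-off such as "IBM > 0.5" would still need separate handling.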

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Deb Lucene [mailto:deb.luc...@gmail.com]
> Sent: Wednesday, March 21, 2012 4:48 PM
> To: java-user@lucene.apache.org
> Subject: Re: A key value field storing
>
> Hi Ian,
>
> Thanks for the reply. I am not sure if the bq solution will be able to solve the
> problem. Let me explain with an example -
>
> document 1 - (some text)
> IBM - 0.6
> Google - 0.1
> Apple - 0.4
>
> Now suppose I index the document based on the "company name" and
> "confidence scores" separately and search using the bq where the Numeric
> Field search is based on "anything below 0.5" and text = "IBM". Here, by
> mistake the document 1 will be chosen (as it has been stored with 0.6, 0.1 and
> 0.4). But actually it should not be - as the "IBM" score is 0.6. So in gist - this
> problem needs some sort of linking between the company name and the
> scores.
>
> --d
>
> On Wed, Mar 21, 2012 at 10:41 AM, Ian Lea <ian....@gmail.com> wrote:
>
> > Why do you want to link name and confidence in one field? Store
> > confidence as a NumericField and search something like
> >
> > BooleanQuery bq = new BooleanQuery();
> > Query nameq = parser.parse(...); // or whatever
> > Query confq = NumericRangeQuery.newXxx(...);
> > bq.add(nameq, ...);
> > bq.add(confq, ...);
> >
> > and search using bq.
> >
> > --
> > Ian.
> >
> > On Wed, Mar 21, 2012 at 2:20 PM, Deb Lucene <deb.luc...@gmail.com> wrote:
> > > Hi Group,
> > >
> > > Sorry for cross posting!
> > >
> > > We need to index a document corpus (news articles) with some meta
> > > data features. The meta data are actually company names with some
> > > scoring (a double, between 0 and 1). For example, two documents can
> > > be -
> > >
> > > document 1
> > > (some text - say a technical article from NY times). It comes with
> > > the metadata like -
> > > IBM - 0.5
> > > Google - 0.9
> > > Apple - 0.3
> > >
> > > where 0.5, 0.9, 0.3 are some confidence scores for the company names.
> > >
> > > Similarly, the document 2 is about some IT article and then the meta
> > > data are like -
> > > IBM - 0.6
> > > Google - 0.1
> > > Apple - 0.4
> > >
> > > now we can index the documents based on the contents or the company
> > > names easily. But here the problem is we need to create a "field"
> > > where the company names and the scores are linked. So that we can
> > > search something like -
> > >
> > > query = where the "company name" (a field) is "IBM" and the score
> > > of IBM is > 0.5.
> > > So in that case the document 2 will be retrieved.
> > >
> > > I am wondering if anyone has ideas about using the company names and
> > > scores (linked) together as a field.
> > >
> > > Thanks in advance,
> > >
> > > --d
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org