Re: Is negative boost possible?
Yonik Seeley wrote:
> On Mon, Oct 12, 2009 at 12:03 PM, Andrzej Bialecki a...@getopt.org wrote:
>>> Solr never discarded non-positive hits, and now Lucene 2.9 no longer
>>> does either.
>>
>> Hmm ... The code that I pasted in my previous email uses
>> Searcher.search(Query, int), which in turn uses
>> search(Query, Filter, int), and it doesn't return any results if only
>> the first clause is present (the one with negative boost) even though
>> it's a matching clause.
>>
>> I think this is related to the fact that in TopScoreDocCollector:48 the
>> pqTop.score is initialized to 0, and then all results that have a lower
>> score than this are discarded. Perhaps this should be initialized to
>> Float.MIN_VALUE?
>
> Hmmm, you're actually seeing this with Lucene 2.9?
> The HitQueue (subclass of PriorityQueue) is pre-populated with sentinel
> objects with scores of -Inf, not zero.

Uhh, sorry, you are right - an early 2.9-dev version of the jar sneaked in on my classpath. I verified now that 2.9.0 returns both positive and negative scores with the default TopScoreDocCollector.

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
Re: Is negative boost possible?
Yonik Seeley wrote:
> On Sun, Oct 11, 2009 at 6:04 PM, Lance Norskog goks...@gmail.com wrote:
>> And the other important thing to know about boost values is that the
>> dynamic range is about 6-8 bits
>
> That's an index-time boost - an 8 bit float with 5 bits of mantissa and
> 3 bits of exponent. Query time boosts are normal 32 bit floats.

To be more specific: index-time float encoding does not permit negative numbers (see SmallFloat), but query-time boosts can be negative, and they DO affect the score - see below.

BTW, standard Collectors collect only results with positive scores, so if you want to collect results with negative scores as well then you need to use a custom Collector.

---
BeanShell 2.0b4 - by Pat Niemeyer (p...@pat.net)
bsh % import org.apache.lucene.search.*;
bsh % import org.apache.lucene.index.*;
bsh % import org.apache.lucene.store.*;
bsh % import org.apache.lucene.document.*;
bsh % import org.apache.lucene.analysis.*;
bsh % tq = new TermQuery(new Term("a", "b"));
bsh % print(tq);
a:b
bsh % tq.setBoost(-1);
bsh % print(tq);
a:b^-1.0
bsh % q = new BooleanQuery();
bsh % tq1 = new TermQuery(new Term("a", "c"));
bsh % tq1.setBoost(10);
bsh % q.add(tq1, BooleanClause.Occur.SHOULD);
bsh % q.add(tq, BooleanClause.Occur.SHOULD);
bsh % print(q);
a:c^10.0 a:b^-1.0
bsh % dir = new RAMDirectory();
bsh % w = new IndexWriter(dir, new WhitespaceAnalyzer());
bsh % doc = new Document();
bsh % doc.add(new Field("a", "b c d", Field.Store.YES, Field.Index.ANALYZED));
bsh % w.addDocument(doc);
bsh % w.close();
bsh % r = IndexReader.open(dir);
bsh % is = new IndexSearcher(r);
bsh % td = is.search(q, 10);
bsh % sd = td.scoreDocs;
bsh % print(sd.length);
1
bsh % print(is.explain(q, 0));
0.1373985 = (MATCH) sum of:
  0.15266499 = (MATCH) weight(a:c^10.0 in 0), product of:
    0.99503726 = queryWeight(a:c^10.0), product of:
      10.0 = boost
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.32427183 = queryNorm
    0.15342641 = (MATCH) fieldWeight(a:c in 0), product of:
      1.0 = tf(termFreq(a:c)=1)
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.5 = fieldNorm(field=a, doc=0)
  -0.0152664995 = (MATCH) weight(a:b^-1.0 in 0), product of:
    -0.099503726 = queryWeight(a:b^-1.0), product of:
      -1.0 = boost
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.32427183 = queryNorm
    0.15342641 = (MATCH) fieldWeight(a:b in 0), product of:
      1.0 = tf(termFreq(a:b)=1)
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.5 = fieldNorm(field=a, doc=0)
bsh %

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
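The numbers in the explain() output above can be rechecked by hand. What follows is only a sketch in plain Java (no Lucene on the classpath), assuming the classic TF-IDF formulas: idf = 1 + ln(numDocs / (docFreq + 1)), queryNorm = 1 / sqrt(sum of (boost * idf)^2), and a coord factor of 1 because both clauses match. The class and method names are made up for illustration, and the fieldNorm of 0.5 is simply copied from the output.

```java
// Sketch: recompute the explain() figures for a:c^10.0 a:b^-1.0 on a one-document index.
final class NegativeBoostScoreSketch {

    // Assumed idf formula: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Assumed query normalization: 1 / sqrt(sum over clauses of (boost * idf)^2).
    static double queryNorm(double[] boosts, double idf) {
        double sumOfSquares = 0.0;
        for (double boost : boosts) {
            sumOfSquares += (boost * idf) * (boost * idf);
        }
        return 1.0 / Math.sqrt(sumOfSquares);
    }

    // One clause contributes queryWeight * fieldWeight.
    static double clauseScore(double boost, double idf, double queryNorm,
                              double tf, double fieldNorm) {
        double queryWeight = boost * idf * queryNorm;
        double fieldWeight = tf * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    // Sum of the positive (boost 10) and negative (boost -1) clauses.
    static double totalScore() {
        double idf = idf(1, 1);
        double norm = queryNorm(new double[] {10.0, -1.0}, idf);
        return clauseScore(10.0, idf, norm, 1.0, 0.5)
             + clauseScore(-1.0, idf, norm, 1.0, 0.5);
    }

    public static void main(String[] args) {
        System.out.println(totalScore());
    }
}
```

The negative clause subtracts roughly a tenth of what the positive clause adds, which is why the document still ends up with a positive total.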
Re: Is negative boost possible?
On Mon, Oct 12, 2009 at 5:58 AM, Andrzej Bialecki a...@getopt.org wrote:
> BTW, standard Collectors collect only results with positive scores, so
> if you want to collect results with negative scores as well then you
> need to use a custom Collector.

Solr never discarded non-positive hits, and now Lucene 2.9 no longer does either.

-Yonik
Re: Is negative boost possible?
Yonik Seeley wrote:
> On Mon, Oct 12, 2009 at 5:58 AM, Andrzej Bialecki a...@getopt.org wrote:
>> BTW, standard Collectors collect only results with positive scores, so
>> if you want to collect results with negative scores as well then you
>> need to use a custom Collector.
>
> Solr never discarded non-positive hits, and now Lucene 2.9 no longer
> does either.

Hmm ... The code that I pasted in my previous email uses Searcher.search(Query, int), which in turn uses search(Query, Filter, int), and it doesn't return any results if only the first clause is present (the one with negative boost) even though it's a matching clause.

I think this is related to the fact that in TopScoreDocCollector:48 the pqTop.score is initialized to 0, and then all results that have a lower score than this are discarded. Perhaps this should be initialized to Float.MIN_VALUE?

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
Re: Is negative boost possible?
On Mon, Oct 12, 2009 at 12:03 PM, Andrzej Bialecki a...@getopt.org wrote:
>> Solr never discarded non-positive hits, and now Lucene 2.9 no longer
>> does either.
>
> Hmm ... The code that I pasted in my previous email uses
> Searcher.search(Query, int), which in turn uses
> search(Query, Filter, int), and it doesn't return any results if only
> the first clause is present (the one with negative boost) even though
> it's a matching clause.
>
> I think this is related to the fact that in TopScoreDocCollector:48 the
> pqTop.score is initialized to 0, and then all results that have a lower
> score than this are discarded. Perhaps this should be initialized to
> Float.MIN_VALUE?

Hmmm, you're actually seeing this with Lucene 2.9?
The HitQueue (subclass of PriorityQueue) is pre-populated with sentinel objects with scores of -Inf, not zero.

-Yonik
http://www.lucidimagination.com
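To make the sentinel point concrete, here is a rough sketch in plain Java. It is not Lucene's HitQueue or TopScoreDocCollector; the class name, the topN helper and the "score must beat the queue's lowest entry" rule are assumptions made for illustration. It pre-fills a min-heap with sentinel scores and compares three choices of sentinel. One detail worth knowing: in Java, Float.MIN_VALUE is the smallest positive float, not the most negative one, so it behaves like 0 here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Sketch: a top-N queue pre-populated with sentinel scores (illustrative, not Lucene code).
final class SentinelTopNSketch {

    // Returns the real hits that made it into a queue of size n, best score first.
    static List<Float> topN(float[] scores, int n, float sentinel) {
        PriorityQueue<Float> queue = new PriorityQueue<Float>(); // min-heap: lowest score on top
        for (int i = 0; i < n; i++) {
            queue.add(sentinel);
        }
        for (float score : scores) {
            // A hit only enters the queue if it beats the current lowest entry.
            if (score > queue.peek()) {
                queue.poll();
                queue.add(score);
            }
        }
        List<Float> hits = new ArrayList<Float>();
        for (Float entry : queue) {
            if (entry > sentinel) { // anything still equal to the sentinel is not a real hit
                hits.add(entry);
            }
        }
        Collections.sort(hits, Collections.reverseOrder());
        return hits;
    }

    public static void main(String[] args) {
        float[] onlyNegative = {-0.0152665f};
        System.out.println(topN(onlyNegative, 10, 0.0f));                    // sentinel 0
        System.out.println(topN(onlyNegative, 10, Float.MIN_VALUE));         // tiny positive sentinel
        System.out.println(topN(onlyNegative, 10, Float.NEGATIVE_INFINITY)); // -Inf sentinel
    }
}
```

With a sentinel of 0 or Float.MIN_VALUE the negatively scored hit never gets in; only a sentinel of negative infinity lets it through.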
Re: Is negative boost possible?
I've been told over and over what Koji said - the convention is that 1.0 is the default center of the boost axis.

And the other important thing to know about boost values is that the dynamic range is about 6-8 bits, so use a range like 2.0, 4.0, 12.0 instead of 100.0, 200.0, 1200.0.

Lance

On Sat, Oct 10, 2009 at 9:07 PM, ragi raghuveer.kanche...@gmail.com wrote:
> If you don't want to do a pure negative query and just want to boost a
> few documents down based on a matching criterion, try using the linear
> function (one of the functions available as a boost function) with a
> negative m (slope). We could solve our problem this way.
>
> We wanted to negatively boost some documents based on certain keywords
> while
>
> Marc Sturlese wrote:
>> : the only way to negative boost is to positively boost the inverse...
>> :
>> :     (*:* -field1:value_to_penalize)^10
>>
>> This will do the job as well, as bq supports pure negative queries (at
>> least in trunk):
>>
>>     bq=-field1:value_to_penalize^10
>>
>> http://wiki.apache.org/solr/SolrRelevancyFAQ#head-76e53db8c5fd31133dc3566318d1aad2bb23e07e
>>
>> hossman wrote:
>>> : Use decimal figure less than 1, e.g. 0.5, to express less importance.
>>>
>>> but that's still a positive boost ... it still increases the scores of
>>> documents that match.
>>>
>>> the only way to negative boost is to positively boost the inverse...
>>>
>>>     (*:* -field1:value_to_penalize)^10
>>>
>>> : I am looking for a way to assign negative boost to a term in Solr query.
>>> : Our use scenario is that we want to boost matching documents that are
>>> : updated recently and penalize those that have not been updated for a long
>>> : time. There are other terms in the query that would affect the scores as
>>> : well. For example we construct a query similar to this:
>>> :
>>> : *:* field1:value1^2 field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
>>> : lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3
>>> :
>>> : I notice it's not possible to simply use a negative boosting factor in the
>>> : query. Is there any way to achieve such result?
>>> :
>>> : Regards,
>>> : Shi Quan He
>>>
>>> -Hoss
>
> --
> View this message in context: http://www.nabble.com/Is-negative-boost-possible--tp25025775p25840621.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Lance Norskog
goks...@gmail.com
Re: Is negative boost possible?
On Sun, Oct 11, 2009 at 6:04 PM, Lance Norskog goks...@gmail.com wrote:
> And the other important thing to know about boost values is that the
> dynamic range is about 6-8 bits

That's an index-time boost - an 8 bit float with 5 bits of mantissa and 3 bits of exponent. Query time boosts are normal 32 bit floats.

-Yonik
http://www.lucidimagination.com
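For a feel of what the one-byte index-time encoding does to a boost, here is a small standalone sketch. It is written from my recollection of SmallFloat.floatToByte315 and byte315ToFloat, so treat the constants as assumptions rather than a verbatim copy. As far as I can tell from that code, the byte holds 3 mantissa bits and 5 exponent bits, so the bit split quoted above looks transposed. The class and method names are made up for illustration.

```java
// Sketch of a one-byte float with 3 mantissa bits and 5 exponent bits
// (modeled on my recollection of Lucene's SmallFloat; not the library code itself).
final class NormByteSketch {
    private static final int MANTISSA_BITS = 3;
    private static final int ZERO_EXPONENT = 15;
    // Constant offset between the shifted float bits and the stored byte code.
    private static final int OFFSET = (63 - ZERO_EXPONENT) << MANTISSA_BITS;

    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - MANTISSA_BITS);
        if (small <= OFFSET) {
            // Zero and negative values collapse to 0; tiny positives to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= OFFSET + 0x100) {
            return (byte) -1; // too large: clamp to the biggest representable value
        }
        return (byte) (small - OFFSET);
    }

    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - MANTISSA_BITS);
        bits += (63 - ZERO_EXPONENT) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float[] samples = {1.0f, 100.0f, -1.0f, (float) (1.0 / Math.sqrt(3.0))};
        for (float f : samples) {
            System.out.println(f + " -> " + decode(encode(f)));
        }
    }
}
```

Under these assumptions 1.0 survives the round trip, 100.0 comes back as 96.0, a negative value comes back as 0, and 1/sqrt(3) (the length norm of a three-term field) comes back as 0.5.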
Re: Is negative boost possible?
If you don't want to do a pure negative query and just want to boost a few documents down based on a matching criterion, try using the linear function (one of the functions available as a boost function) with a negative m (slope). We could solve our problem this way.

We wanted to negatively boost some documents based on certain keywords while

Marc Sturlese wrote:
> : the only way to negative boost is to positively boost the inverse...
> :
> :     (*:* -field1:value_to_penalize)^10
>
> This will do the job as well, as bq supports pure negative queries (at
> least in trunk):
>
>     bq=-field1:value_to_penalize^10
>
> http://wiki.apache.org/solr/SolrRelevancyFAQ#head-76e53db8c5fd31133dc3566318d1aad2bb23e07e
>
> hossman wrote:
>> : Use decimal figure less than 1, e.g. 0.5, to express less importance.
>>
>> but that's still a positive boost ... it still increases the scores of
>> documents that match.
>>
>> the only way to negative boost is to positively boost the inverse...
>>
>>     (*:* -field1:value_to_penalize)^10
>>
>> : I am looking for a way to assign negative boost to a term in Solr query.
>> : Our use scenario is that we want to boost matching documents that are
>> : updated recently and penalize those that have not been updated for a long
>> : time. There are other terms in the query that would affect the scores as
>> : well. For example we construct a query similar to this:
>> :
>> : *:* field1:value1^2 field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
>> : lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3
>> :
>> : I notice it's not possible to simply use a negative boosting factor in the
>> : query. Is there any way to achieve such result?
>> :
>> : Regards,
>> : Shi Quan He
>>
>> -Hoss

--
View this message in context: http://www.nabble.com/Is-negative-boost-possible--tp25025775p25840621.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Is negative boost possible?
: the only way to negative boost is to positively boost the inverse...
:
:     (*:* -field1:value_to_penalize)^10

This will do the job as well, as bq supports pure negative queries (at least in trunk):

    bq=-field1:value_to_penalize^10

http://wiki.apache.org/solr/SolrRelevancyFAQ#head-76e53db8c5fd31133dc3566318d1aad2bb23e07e

hossman wrote:
> : Use decimal figure less than 1, e.g. 0.5, to express less importance.
>
> but that's still a positive boost ... it still increases the scores of
> documents that match.
>
> the only way to negative boost is to positively boost the inverse...
>
>     (*:* -field1:value_to_penalize)^10
>
> : I am looking for a way to assign negative boost to a term in Solr query.
> : Our use scenario is that we want to boost matching documents that are
> : updated recently and penalize those that have not been updated for a long
> : time. There are other terms in the query that would affect the scores as
> : well. For example we construct a query similar to this:
> :
> : *:* field1:value1^2 field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
> : lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3
> :
> : I notice it's not possible to simply use a negative boosting factor in the
> : query. Is there any way to achieve such result?
> :
> : Regards,
> : Shi Quan He
>
> -Hoss

--
View this message in context: http://www.nabble.com/Is-negative-boost-possible--tp25025775p25039059.html
Sent from the Solr - User mailing list archive at Nabble.com.
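If the request is assembled by hand, the bq value has to be URL-encoded, since it contains ':' and '^'. Below is a minimal sketch in plain Java; the class name, the helper method and the choice of defType=dismax are assumptions for illustration, and it only builds the query string without contacting any server.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch: build the parameter string for a query with a pure negative boost query (bq).
final class PenalizeBqSketch {

    // Hypothetical helper; field, value and boost are whatever you want to penalize.
    static String buildQueryString(String userQuery, String field, String value, int boost) {
        String bq = "-" + field + ":" + value + "^" + boost;
        try {
            return "defType=dismax"
                + "&q=" + URLEncoder.encode(userQuery, "UTF-8")
                + "&bq=" + URLEncoder.encode(bq, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildQueryString("solr relevancy", "field1", "value_to_penalize", 10));
    }
}
```

Whether a pure negative bq is accepted depends on the Solr version in use, as noted above ("at least in trunk").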
Is negative boost possible?
Hi all,

I am looking for a way to assign negative boost to a term in Solr query. Our use scenario is that we want to boost matching documents that are updated recently and penalize those that have not been updated for a long time. There are other terms in the query that would affect the scores as well. For example we construct a query similar to this:

    *:* field1:value1^2 field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5 lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3

I notice it's not possible to simply use a negative boosting factor in the query. Is there any way to achieve such result?

Regards,
Shi Quan He
Re: Is negative boost possible?
Hi,

Use decimal figure less than 1, e.g. 0.5, to express less importance.

Koji

Larry He wrote:
> Hi all,
>
> I am looking for a way to assign negative boost to a term in Solr query.
> Our use scenario is that we want to boost matching documents that are
> updated recently and penalize those that have not been updated for a long
> time. There are other terms in the query that would affect the scores as
> well. For example we construct a query similar to this:
>
>     *:* field1:value1^2 field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5 lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3
>
> I notice it's not possible to simply use a negative boosting factor in the
> query. Is there any way to achieve such result?
>
> Regards,
> Shi Quan He
Re: Is negative boost possible?
: Use decimal figure less than 1, e.g. 0.5, to express less importance.

but that's still a positive boost ... it still increases the scores of documents that match.

the only way to negative boost is to positively boost the inverse...

    (*:* -field1:value_to_penalize)^10

: I am looking for a way to assign negative boost to a term in Solr query.
: Our use scenario is that we want to boost matching documents that are
: updated recently and penalize those that have not been updated for a long
: time. There are other terms in the query that would affect the scores as
: well. For example we construct a query similar to this:
:
: *:* field1:value1^2 field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
: lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3
:
: I notice it's not possible to simply use a negative boosting factor in the
: query. Is there any way to achieve such result?
:
: Regards,
: Shi Quan He

-Hoss