Re: Is negative boost possible?

2009-10-13 Thread Andrzej Bialecki

Yonik Seeley wrote:

On Mon, Oct 12, 2009 at 12:03 PM, Andrzej Bialecki a...@getopt.org wrote:

Solr never discarded non-positive hits, and now Lucene 2.9 no longer
does either.

Hmm ... The code that I pasted in my previous email uses
Searcher.search(Query, int), which in turn uses search(Query, Filter, int),
and it doesn't return any results if only the first clause is present (the
one with negative boost) even though it's a matching clause.

I think this is related to the fact that in TopScoreDocCollector:48 the
pqTop.score is initialized to 0, and then all results that have a lower score
than this are discarded. Perhaps this should be initialized to
Float.MIN_VALUE?


Hmmm, You're actually seeing this with Lucene 2.9?
The HitQueue (subclass of PriorityQueue) is pre-populated with
sentinel objects with scores of -Inf, not zero.


Uhh, sorry, you are right - an early 2.9-dev version of the jar had
sneaked onto my classpath. I have now verified that 2.9.0 returns both
positive and negative scores with the default TopScoreDocCollector.


--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Is negative boost possible?

2009-10-12 Thread Andrzej Bialecki

Yonik Seeley wrote:

On Sun, Oct 11, 2009 at 6:04 PM, Lance Norskog goks...@gmail.com wrote:

And the other important
thing to know about boost values is that the dynamic range is about
6-8 bits


That's an index-time boost - an 8 bit float with 3 bits of mantissa
and 5 bits of exponent.
Query time boosts are normal 32 bit floats.


To be more specific: index-time float encoding does not permit negative 
numbers (see SmallFloat), but query-time boosts can be negative, and 
they DO affect the score - see below. BTW, standard Collectors collect 
only results with positive scores, so if you want to collect results 
with negative scores as well then you need to use a custom Collector.


---
BeanShell 2.0b4 - by Pat Niemeyer (p...@pat.net)
bsh % import org.apache.lucene.search.*;
bsh % import org.apache.lucene.index.*;
bsh % import org.apache.lucene.store.*;
bsh % import org.apache.lucene.document.*;
bsh % import org.apache.lucene.analysis.*;
bsh % tq = new TermQuery(new Term("a", "b"));
bsh % print(tq);
a:b
bsh % tq.setBoost(-1);
bsh % print(tq);
a:b^-1.0
bsh % q = new BooleanQuery();
bsh % tq1 = new TermQuery(new Term("a", "c"));
bsh % tq1.setBoost(10);
bsh % q.add(tq1, BooleanClause.Occur.SHOULD);
bsh % q.add(tq, BooleanClause.Occur.SHOULD);
bsh % print(q);
a:c^10.0 a:b^-1.0
bsh % dir = new RAMDirectory();
bsh % w = new IndexWriter(dir, new WhitespaceAnalyzer());
bsh % doc = new Document();
bsh % doc.add(new Field("a", "b c d", Field.Store.YES, Field.Index.ANALYZED));

bsh % w.addDocument(doc);
bsh % w.close();
bsh % r = IndexReader.open(dir);
bsh % is = new IndexSearcher(r);
bsh % td = is.search(q, 10);
bsh % sd = td.scoreDocs;
bsh % print(sd.length);
1
bsh % print(is.explain(q, 0));
0.1373985 = (MATCH) sum of:
  0.15266499 = (MATCH) weight(a:c^10.0 in 0), product of:
    0.99503726 = queryWeight(a:c^10.0), product of:
      10.0 = boost
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.32427183 = queryNorm
    0.15342641 = (MATCH) fieldWeight(a:c in 0), product of:
      1.0 = tf(termFreq(a:c)=1)
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.5 = fieldNorm(field=a, doc=0)
  -0.0152664995 = (MATCH) weight(a:b^-1.0 in 0), product of:
    -0.099503726 = queryWeight(a:b^-1.0), product of:
      -1.0 = boost
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.32427183 = queryNorm
    0.15342641 = (MATCH) fieldWeight(a:b in 0), product of:
      1.0 = tf(termFreq(a:b)=1)
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.5 = fieldNorm(field=a, doc=0)

bsh %
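
The numbers in the explain() output can be re-derived by hand. The following is a minimal sketch, assuming the classic DefaultSimilarity formulas (idf = 1 + ln(numDocs / (docFreq + 1)), tf = sqrt(termFreq), queryNorm = 1 / sqrt(sum of squared boost * idf)) and taking fieldNorm = 0.5 directly from the output. The class and method names are hypothetical and are not part of Lucene.

```java
// Hypothetical helper that re-derives the explain() numbers; not Lucene code.
class ExplainArithmetic {
    // Assumed DefaultSimilarity idf: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    // Contribution of one term clause: queryWeight * fieldWeight.
    static double clause(double boost, double idf, double queryNorm,
                         double tf, double fieldNorm) {
        double queryWeight = boost * idf * queryNorm;
        double fieldWeight = tf * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    // Score of "a:c^10.0 a:b^-1.0" against the single indexed document.
    static double score() {
        double idf = idf(1, 1);                         // about 0.30685282
        double sumSq = Math.pow(10.0 * idf, 2) + Math.pow(-1.0 * idf, 2);
        double queryNorm = 1.0 / Math.sqrt(sumSq);      // about 0.32427183
        double tf = Math.sqrt(1.0);                     // termFreq = 1
        double fieldNorm = 0.5;                         // value reported by explain()
        return clause(10.0, idf, queryNorm, tf, fieldNorm)
             + clause(-1.0, idf, queryNorm, tf, fieldNorm);
    }

    public static void main(String[] args) {
        System.out.println(score());
    }
}
```

The negative boost flips the sign of queryWeight only; fieldWeight is identical for both clauses, which is why the second clause contributes exactly -1/10 of the first.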





Re: Is negative boost possible?

2009-10-12 Thread Yonik Seeley
On Mon, Oct 12, 2009 at 5:58 AM, Andrzej Bialecki a...@getopt.org wrote:
 BTW, standard Collectors collect only results
 with positive scores, so if you want to collect results with negative scores
 as well then you need to use a custom Collector.

Solr never discarded non-positive hits, and now Lucene 2.9 no longer
does either.

-Yonik


Re: Is negative boost possible?

2009-10-12 Thread Andrzej Bialecki

Yonik Seeley wrote:

On Mon, Oct 12, 2009 at 5:58 AM, Andrzej Bialecki a...@getopt.org wrote:

BTW, standard Collectors collect only results
with positive scores, so if you want to collect results with negative scores
as well then you need to use a custom Collector.


Solr never discarded non-positive hits, and now Lucene 2.9 no longer
does either.


Hmm ... The code that I pasted in my previous email uses 
Searcher.search(Query, int), which in turn uses search(Query, Filter, 
int), and it doesn't return any results if only the first clause is 
present (the one with negative boost) even though it's a matching clause.


I think this is related to the fact that in TopScoreDocCollector:48 the 
pqTop.score is initialized to 0, and then all results that have a lower 
score than this are discarded. Perhaps this should be initialized to 
Float.MIN_VALUE?
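
One caveat about the suggested constant: in Java, Float.MIN_VALUE is the smallest positive float (about 1.4e-45), not the most negative one, so using it as the floor would still reject negative scores. A minimal sketch follows; `admits` is a hypothetical stand-in for a collector's "score beats the current floor" check and is not Lucene code.

```java
// Hypothetical illustration of candidate floor values; not Lucene code.
class ScoreFloor {
    // Keep a hit only if its score is strictly above the floor.
    static boolean admits(float floor, float score) {
        return score > floor;
    }

    public static void main(String[] args) {
        float negative = -0.0152664995f; // a negative clause score like the one discussed in this thread
        System.out.println(Float.MIN_VALUE > 0f);                       // true
        System.out.println(admits(0f, negative));                       // false
        System.out.println(admits(Float.MIN_VALUE, negative));          // false
        System.out.println(admits(Float.NEGATIVE_INFINITY, negative));  // true
    }
}
```

A floor that admits every finite score would be Float.NEGATIVE_INFINITY (or -Float.MAX_VALUE).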






Re: Is negative boost possible?

2009-10-12 Thread Yonik Seeley
On Mon, Oct 12, 2009 at 12:03 PM, Andrzej Bialecki a...@getopt.org wrote:
 Solr never discarded non-positive hits, and now Lucene 2.9 no longer
 does either.

 Hmm ... The code that I pasted in my previous email uses
 Searcher.search(Query, int), which in turn uses search(Query, Filter, int),
 and it doesn't return any results if only the first clause is present (the
 one with negative boost) even though it's a matching clause.

 I think this is related to the fact that in TopScoreDocCollector:48 the
 pqTop.score is initialized to 0, and then all results that have a lower score
 than this are discarded. Perhaps this should be initialized to
 Float.MIN_VALUE?

Hmmm, you're actually seeing this with Lucene 2.9?
The HitQueue (a subclass of PriorityQueue) is pre-populated with
sentinel objects with scores of -Inf, not zero.
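
The effect of the sentinel value can be sketched with a plain java.util.PriorityQueue. This is only an illustration of the idea of a pre-populated queue, assuming a strict "greater than the weakest entry" admission test; it is not Lucene's HitQueue or TopScoreDocCollector, and `SentinelTopN` is a made-up name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical top-N collector built on a min-heap pre-filled with sentinels.
class SentinelTopN {
    // Returns the real scores that survive, best first. Requires n > 0.
    static List<Float> topN(float[] scores, int n, float sentinel) {
        PriorityQueue<Float> pq = new PriorityQueue<>(); // head = weakest entry
        for (int i = 0; i < n; i++) {
            pq.add(sentinel);
        }
        for (float s : scores) {
            if (s > pq.peek()) {   // a hit must beat the weakest entry to get in
                pq.poll();
                pq.add(s);
            }
        }
        List<Float> kept = new ArrayList<>();
        for (float s : pq) {
            if (s != sentinel) {   // drop sentinels that were never displaced
                kept.add(s);
            }
        }
        kept.sort(Collections.reverseOrder());
        return kept;
    }

    public static void main(String[] args) {
        float[] onlyNegative = { -0.0152664995f };
        System.out.println(topN(onlyNegative, 10, 0f));                      // []
        System.out.println(topN(onlyNegative, 10, Float.NEGATIVE_INFINITY)); // one hit
    }
}
```

With a sentinel of 0 a lone negative-scoring match never displaces anything, which matches the behaviour reported for the early 2.9-dev jar; with -Inf it is kept.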

-Yonik
http://www.lucidimagination.com


Re: Is negative boost possible?

2009-10-11 Thread Lance Norskog
I've been told over and over what Koji said - the convention is that
1.0 is the default center of the boost axis. And the other important
thing to know about boost values is that the dynamic range is about
6-8 bits, so use values such as 2.0, 4.0, 12.0 instead of 100.0, 200.0,
1200.0.

Lance


-- 
Lance Norskog
goks...@gmail.com


Re: Is negative boost possible?

2009-10-11 Thread Yonik Seeley
On Sun, Oct 11, 2009 at 6:04 PM, Lance Norskog goks...@gmail.com wrote:
 And the other important
 thing to know about boost values is that the dynamic range is about
 6-8 bits

That's an index-time boost - an 8 bit float with 3 bits of mantissa
and 5 bits of exponent.
Query time boosts are normal 32 bit floats.
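
A minimal sketch of what that 8-bit encoding does to index-time values. It is modelled on Lucene's SmallFloat.floatToByte315 / byte315ToFloat (documented as a three-bit mantissa, five-bit exponent, zero-exponent point 15) but is written out from memory here, so treat the bit manipulation as an assumption rather than the library source; `Byte315` is a made-up class name.

```java
// Hypothetical re-implementation of an 8-bit "315" float; assumed, not copied from Lucene.
class Byte315 {
    // Encode a float into one byte; zero and negative inputs collapse to 0.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);
        if (small <= ((63 - 15) << 3)) {
            return (bits <= 0) ? (byte) 0 : (byte) 1; // zero/negative -> 0, underflow -> smallest
        }
        if (small >= ((63 - 15) << 3) + 0x100) {
            return -1;                                // overflow -> largest value
        }
        return (byte) (small - ((63 - 15) << 3));
    }

    // Decode the byte back into a float.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float[] samples = { 2.0f, 4.0f, 12.0f, 100.0f, 200.0f, 1200.0f, -1.0f };
        for (float f : samples) {
            System.out.println(f + " -> " + decode(encode(f)));
        }
    }
}
```

Under this sketch small values such as 2.0, 4.0 and 12.0 survive the round trip unchanged, larger ones are rounded down coarsely, and a negative value cannot be represented at all. None of this applies to query-time boosts, which stay ordinary 32-bit floats.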

-Yonik
http://www.lucidimagination.com


Re: Is negative boost possible?

2009-10-10 Thread ragi

If you don't want to do a pure negative query and just want to boost a few
documents down based on a matching criterion, try using the linear function (one
of the functions available for boost functions) with a negative m (slope).
We were able to solve our problem this way.
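
A minimal sketch of the arithmetic behind that suggestion, assuming Solr's linear(x,m,c) function evaluates to m*x + c and that the boost-function value is added to the base relevance score. The class, the method names and the sample numbers are hypothetical.

```java
// Hypothetical model of an additive boost function with a negative slope.
class LinearBoost {
    // Assumed semantics of linear(x, m, c): m * x + c.
    static double linear(double x, double m, double c) {
        return m * x + c;
    }

    // Base relevance score plus the (possibly negative) function value.
    static double boosted(double baseScore, double x, double m, double c) {
        return baseScore + linear(x, m, c);
    }

    public static void main(String[] args) {
        // With m = -0.25, a document whose x is 4.0 loses a full point.
        System.out.println(boosted(2.0, 4.0, -0.25, 0.0)); // 1.0
        // A document whose x is 0.0 keeps its base score.
        System.out.println(boosted(2.0, 0.0, -0.25, 0.0)); // 2.0
    }
}
```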


We wanted to negatively boost some documents based on certain keywords
while 


-- 
View this message in context: 
http://www.nabble.com/Is-negative-boost-possible--tp25025775p25840621.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Is negative boost possible?

2009-08-19 Thread Marc Sturlese


:the only way to negative boost is to positively boost the inverse...
:
:  (*:* -field1:value_to_penalize)^10

This will do the job as well, since bq supports pure negative queries (at least
in trunk):
bq=-field1:value_to_penalize^10

http://wiki.apache.org/solr/SolrRelevancyFAQ#head-76e53db8c5fd31133dc3566318d1aad2bb23e07e
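
When either form of the boost query is sent over HTTP, the characters `:`, `^`, parentheses and spaces need URL-encoding. A minimal sketch using only the JDK; the helper name `param` is hypothetical, and how the resulting string is attached to a request URL is left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds a URL-encoded name=value request parameter.
class BoostQueryParam {
    static String param(String name, String value) {
        try {
            return name + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // Positively boosting the inverse:
        System.out.println(param("bq", "(*:* -field1:value_to_penalize)^10"));
        // Pure negative boost query:
        System.out.println(param("bq", "-field1:value_to_penalize^10"));
    }
}
```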



-- 
View this message in context: 
http://www.nabble.com/Is-negative-boost-possible--tp25025775p25039059.html
Sent from the Solr - User mailing list archive at Nabble.com.



Is negative boost possible?

2009-08-18 Thread Larry He
Hi all,

I am looking for a way to assign a negative boost to a term in a Solr query.
Our use case is that we want to boost matching documents that were
updated recently and penalize those that have not been updated for a long
time.  There are other terms in the query that would affect the scores as
well.  For example, we construct a query similar to this:

*:* field1:value1^2  field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3

I notice it's not possible to simply use a negative boosting factor in the
query.  Is there any way to achieve such a result?

Regards,
Shi Quan He


Re: Is negative boost possible?

2009-08-18 Thread Koji Sekiguchi

Hi,

Use a decimal value less than 1, e.g. 0.5, to express less importance.

Koji





Re: Is negative boost possible?

2009-08-18 Thread Chris Hostetter

: Use decimal figure less than 1, e.g. 0.5, to express less importance.

but that's still a positive boost ... it still increases the scores of 
documents that match.

the only way to negative boost is to positively boost the inverse...

(*:* -field1:value_to_penalize)^10
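
To see why a fractional boost does not penalize while boosting the inverse does, here is a toy additive model: an optional clause adds its boost to a document's score only when the clause matches that document. This is a deliberate simplification and not Lucene's scoring formula; all names and numbers are hypothetical.

```java
// Toy model: base score plus the clause boost when the optional clause matches.
class InverseBoost {
    static double score(double base, boolean clauseMatches, double boost) {
        return base + (clauseMatches ? boost : 0.0);
    }

    public static void main(String[] args) {
        double base = 1.0;

        // field1:value_to_penalize^0.5 matches the "bad" document,
        // which therefore still scores higher than the clean one.
        System.out.println(score(base, true, 0.5));   // bad doc:   1.5
        System.out.println(score(base, false, 0.5));  // clean doc: 1.0

        // (*:* -field1:value_to_penalize)^10 matches only documents WITHOUT
        // the value, so the clean document is lifted above the bad one.
        System.out.println(score(base, false, 10.0)); // bad doc:   1.0
        System.out.println(score(base, true, 10.0));  // clean doc: 11.0
    }
}
```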

:  I am looking for a way to assign negative boost to a term in Solr query.
:  Our use scenario is that we want to boost matching documents that are
:  updated recently and penalize those that have not been updated for a long
:  time.  There are other terms in the query that would affect the scores as
:  well.  For example we construct a query similar to this:
:  
:  *:* field1:value1^2  field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
:  lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3
:  
:  I notice it's not possible to simply use a negative boosting factor in the
:  query.  Is there any way to achieve such result?
:  
:  Regards,
:  Shi Quan He
:  
:



-Hoss