Re: More Like This boost

2008-04-24 Thread Jonathan Ariel
Ok. Here it is.
https://issues.apache.org/jira/browse/LUCENE-1272




On Tue, Apr 22, 2008 at 2:24 PM, Francisco Sanmartin wrote:

 Yep, it would be nice for MLT to have this feature; that's why I am trying
 to build it into the queries before sending them to Solr. These are the
 steps I'm following:

 1. Execute mlt.like() with the text document_example.getTitle() against
 the Title field of all the other documents. This returns a query
 containing the most relevant words in the example document and in the
 Title field of the other documents. We will call this query QueryTitle.
 For example: QueryTitle = (words^0.4 in^0.3 the^0.56 title^0.65)
 2. Execute mlt.like() with the text document_example.getDescription()
 against the Description field of all the other documents. This returns a
 query containing the most relevant words in the example document and in
 the Description field of the other documents. We will call this query
 QueryDescription. For example: QueryDescription = (other^0.66 words^0.7
 in^0.33 the^0.49 description^0.43)

 Up to here, everything is possible with the options that MLT offers.

 Now, with the info MLT gave me (QueryTitle and QueryDescription), I want to
 search Solr for documents (with additional filters) to retrieve the best
 matches. But I want QueryTitle to be more important than QueryDescription,
 for example 70% and 30% respectively. That means applying
 QueryTitle^0.70 and QueryDescription^0.30, which gives a Solr query
 like this:
 (words^0.4 in^0.3 the^0.56 title^0.65)^0.70 (other^0.66 words^0.7 in^0.33
 the^0.49 description^0.43)^0.30
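A minimal sketch of how the combined query string above could be assembled on the client side. The helper name boosted_clause and the two dictionaries are hypothetical; the boost values are the example numbers from this message, and field prefixes are left out just as in the example query.

```python
def boosted_clause(term_boosts, clause_boost):
    """Render {term: boost} as '(t1^b1 t2^b2 ...)^clause_boost'."""
    terms = " ".join(f"{term}^{boost}" for term, boost in term_boosts.items())
    return f"({terms})^{clause_boost}"


# Example term boosts as MLT might return them (values taken from the message).
query_title = {"words": 0.4, "in": 0.3, "the": 0.56, "title": 0.65}
query_description = {
    "other": 0.66, "words": 0.7, "in": 0.33, "the": 0.49, "description": 0.43,
}

# Join the two boosted clauses into one query string.
combined = " ".join([
    boosted_clause(query_title, 0.7),
    boosted_clause(query_description, 0.3),
])
print(combined)
```

In a real Solr request each term would probably also need its field prefix (for example Title:words), which this sketch does not add.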

 The question is...is Solr able to understand a boosted query whose terms
 are already boosted? (Remember that MLT returns the interesting terms
 boosted.) Does this make sense? Will the words obtained from a mlt.like() on
 the title be 70% relevant while the words obtained from a mlt.like() on the
 description will be only 30% relevant?
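One way to reason about this question is to assume that a clause boost simply multiplies the boosts of the terms inside the clause. That is a simplification: real Lucene scoring also involves tf, idf and normalization factors, so the final score shares will not be exactly 70% and 30%. The function name effective_boosts is hypothetical.

```python
def effective_boosts(term_boosts, clause_boost):
    """Assumed model: each term's effective boost = term boost * clause boost."""
    return {term: round(boost * clause_boost, 4)
            for term, boost in term_boosts.items()}


title = effective_boosts(
    {"words": 0.4, "in": 0.3, "the": 0.56, "title": 0.65}, 0.70)
description = effective_boosts(
    {"other": 0.66, "words": 0.7, "in": 0.33, "the": 0.49, "description": 0.43},
    0.30)

print(title)
print(description)
```

Under this assumption "words" carries about 0.28 in the title clause and about 0.21 in the description clause, even though its raw boost is higher in the description (0.7 versus 0.4).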

 Of course it would be a nice feature to be able to boost these things
 natively and do only one call to MLT...Don't hesitate to contact me if you
 need any help on developing this feature.


 Thanks!

 Pako

 Erik Hatcher wrote:

 No, the MLT feature does not have that kind of field-specific boosting
 capability.  It sounds like it could be a useful enhancement though.  Of
 course you do get boosts for interesting terms already, but maybe having
 an additional field-specific boost would be a nice touch too.

Erik

 On Apr 22, 2008, at 9:13 AM, Francisco Sanmartin wrote:

 I know that only one query of that type does not change anything. But
 when it's two or more with different boosts, I hope it does. Here is the
 situation:
 My docs have Title and Description. What I want to do is to give more
 relevancy to the morelikethis on the title than on the description. So the
 query would be like this:

 query = (words^0.4 in^0.3 the^0.56 title^0.65)^0.70 (words^0.7 in^0.33
 the^0.49 description^0.43)^0.30

 This way, the words in the title are more relevant than the words in the
 description, right?

 Thanks!

 Pako


 Erik Hatcher wrote:


 On Apr 21, 2008, at 5:02 PM, Francisco Sanmartin wrote:

 Is it possible to boost the query that MoreLikeThis returns before
 sending it to Solr? I mean, technically it is possible, because you can add a
 factor to the whole query but...does it make sense? (Remember that
 MoreLikeThis can already boost each term inside the query).

 For example, this could be a result of MoreLikeThis (with native
 boosting enabled)

 queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29
 morelikethis^0.67)

 what I want to do is

 queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29
 morelikethis^0.67)^0.60  ---(notice the boost of 0.60 for the whole
 query)


 That last boost wouldn't change the doc ordering at all, so it'd be
 kinda useless.

 What are you trying to accomplish?

Erik








Re: More Like This boost

2008-04-22 Thread Erik Hatcher


On Apr 21, 2008, at 5:02 PM, Francisco Sanmartin wrote:
Is it possible to boost the query that MoreLikeThis returns before
sending it to Solr? I mean, technically it is possible, because you
can add a factor to the whole query but...does it make sense?
(Remember that MoreLikeThis can already boost each term inside the
query).


For example, this could be a result of MoreLikeThis (with native  
boosting enabled)


queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29  
morelikethis^0.67)


what I want to do is

queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29
morelikethis^0.67)^0.60  ---(notice the boost of 0.60 for the  
whole query)


That last boost wouldn't change the doc ordering at all, so it'd be  
kinda useless.
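A toy illustration of this point, under a deliberately simplified scoring model (a document's score is the sum of the boosts of the query terms it contains, times the overall query boost). The functions, document names and document contents are made up for the example; this is not Lucene's actual scoring formula.

```python
def score(doc_terms, term_boosts, query_boost=1.0):
    """Toy score: sum of boosts of matching terms, scaled by the query boost."""
    return query_boost * sum(
        boost for term, boost in term_boosts.items() if term in doc_terms)


def ranking(docs, term_boosts, query_boost=1.0):
    """Document names sorted from best to worst toy score."""
    return sorted(
        docs,
        key=lambda name: score(docs[name], term_boosts, query_boost),
        reverse=True)


mlt_terms = {"this": 0.4, "is": 0.5, "a": 0.6,
             "query": 0.33, "of": 0.29, "morelikethis": 0.67}

# Hypothetical documents, each represented as a set of terms.
docs = {
    "doc1": {"this", "is", "morelikethis"},
    "doc2": {"a", "query"},
    "doc3": {"of", "morelikethis", "a"},
}

print(ranking(docs, mlt_terms))        # no overall boost
print(ranking(docs, mlt_terms, 0.60))  # whole query boosted by 0.60
```

Every score is multiplied by the same positive constant, so the two rankings are identical; only the absolute score values change.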


What are you trying to accomplish?

Erik



Re: More Like This boost

2008-04-22 Thread Francisco Sanmartin
I know that only one query of that type does not change anything. But 
when it's two or more with different boosts, I hope it does. Here is the
situation:
My docs have Title and Description. What I want to do is to give 
more relevancy to the morelikethis on the title than on the description. 
So the query would be like this:


query = (words^0.4 in^0.3 the^0.56 title^0.65)^0.70 (words^0.7 in^0.33 
the^0.49 description^0.43)^0.30


This way, the words in the title are more relevant than the words in the 
description, right?
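A toy sketch of why two clauses with different boosts can change the ordering. It assumes each clause is matched only against its own field (Title or Description), which the example query above does not spell out, and it uses a simplified score (sum of matching term boosts times the clause boost) rather than Lucene's real formula. All names and documents here are hypothetical.

```python
def clause_score(field_terms, term_boosts):
    """Toy clause score: sum of boosts of the clause terms found in the field."""
    return sum(boost for term, boost in term_boosts.items()
               if term in field_terms)


def combined_score(doc, title_q, desc_q, title_boost, desc_boost):
    """Weighted sum of the title clause and the description clause."""
    return (title_boost * clause_score(doc["title"], title_q)
            + desc_boost * clause_score(doc["description"], desc_q))


title_q = {"words": 0.4, "in": 0.3, "the": 0.56, "title": 0.65}
desc_q = {"words": 0.7, "in": 0.33, "the": 0.49, "description": 0.43}

# Two hypothetical documents: one matches only in the title,
# the other only in the description.
title_match = {"title": {"words", "title"}, "description": set()}
description_match = {"title": set(), "description": {"words", "description"}}

for boosts in [(1.0, 1.0), (0.70, 0.30)]:
    a = combined_score(title_match, title_q, desc_q, *boosts)
    b = combined_score(description_match, title_q, desc_q, *boosts)
    print(boosts, "title_match:", a, "description_match:", b)
```

With equal clause boosts the description-only match scores higher in this toy model; with 0.70 and 0.30 the title-only match moves ahead.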


Thanks!

Pako


Erik Hatcher wrote:


On Apr 21, 2008, at 5:02 PM, Francisco Sanmartin wrote:
Is it possible to boost the query that MoreLikeThis returns before
sending it to Solr? I mean, technically it is possible, because you can
add a factor to the whole query but...does it make sense? (Remember
that MoreLikeThis can already boost each term inside the query).


For example, this could be a result of MoreLikeThis (with native 
boosting enabled)


queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29 
morelikethis^0.67)


what I want to do is

queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29
morelikethis^0.67)^0.60  ---(notice the boost of 0.60 for the 
whole query)


That last boost wouldn't change the doc ordering at all, so it'd be 
kinda useless.


What are you trying to accomplish?

Erik






Re: More Like This boost

2008-04-22 Thread Erik Hatcher
No, the MLT feature does not have that kind of field-specific  
boosting capability.  It sounds like it could be a useful enhancement  
though.  Of course you do get boosts for interesting terms already,  
but maybe having an additional field-specific boost would be a nice  
touch too.


Erik

On Apr 22, 2008, at 9:13 AM, Francisco Sanmartin wrote:
I know that only one query of that type does not change anything.  
But when it's two or more with different boosts, I hope it does.
Here is the situation:
My docs have Title and Description. What I want to do is to  
give more relevancy to the morelikethis on the title than on the  
description. So the query would be like this:


query = (words^0.4 in^0.3 the^0.56 title^0.65)^0.70 (words^0.7  
in^0.33 the^0.49 description^0.43)^0.30


This way, the words in the title are more relevant than the words  
in the description, right?


Thanks!

Pako


Erik Hatcher wrote:


On Apr 21, 2008, at 5:02 PM, Francisco Sanmartin wrote:
Is it possible to boost the query that MoreLikeThis returns
before sending it to Solr? I mean, technically it is possible,
because you can add a factor to the whole query but...does it
make sense? (Remember that MoreLikeThis can already boost each
term inside the query).


For example, this could be a result of MoreLikeThis (with native  
boosting enabled)


queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29  
morelikethis^0.67)


what I want to do is

queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29
morelikethis^0.67)^0.60  ---(notice the boost of 0.60 for  
the whole query)


That last boost wouldn't change the doc ordering at all, so it'd  
be kinda useless.


What are you trying to accomplish?

Erik






Re: More Like This boost

2008-04-22 Thread Walter Underwood
It should help to weight the terms with their frequency in the
original document. That will distinguish between two documents
with the same terms, but different focus.
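One possible reading of this suggestion, sketched with a hypothetical helper: multiply each term's boost by how often the term occurs in the source document. The whitespace tokenization and the base boosts are assumptions made only for the example.

```python
from collections import Counter


def tf_weighted_boosts(source_text, term_boosts):
    """Scale each term boost by the term's frequency in the source text."""
    counts = Counter(source_text.lower().split())
    return {term: boost * counts[term]
            for term, boost in term_boosts.items() if counts[term] > 0}


base = {"solr": 0.5, "lucene": 0.5}

# Two documents with the same terms but a different focus.
print(tf_weighted_boosts("solr solr solr lucene", base))
print(tf_weighted_boosts("lucene lucene lucene solr", base))
```

Both documents contain the same two terms, but the resulting boosts differ, so the generated queries would favor different matches.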

wunder

On 4/22/08 7:46 AM, Erik Hatcher wrote:

 No, the MLT feature does not have that kind of field-specific
 boosting capability.  It sounds like it could be a useful enhancement
 though.  Of course you do get boosts for interesting terms already,
 but maybe having an additional field-specific boost would be a nice
 touch too.
 
 Erik
 
 On Apr 22, 2008, at 9:13 AM, Francisco Sanmartin wrote:
 I know that only one query of that type does not change anything.
 But when it's two or more with different boosts, I hope it does.
 Here is the situation:
 My docs have Title and Description. What I want to do is to
 give more relevancy to the morelikethis on the title than on the
 description. So the query would be like this:
 
 query = (words^0.4 in^0.3 the^0.56 title^0.65)^0.70 (words^0.7
 in^0.33 the^0.49 description^0.43)^0.30
 
 This way, the words in the title are more relevant than the words
 in the description, right?
 
 Thanks!
 
 Pako
 
 
 Erik Hatcher wrote:
 
 On Apr 21, 2008, at 5:02 PM, Francisco Sanmartin wrote:
 Is it possible to boost the query that MoreLikeThis returns
 before sending it to Solr? I mean, technically it is possible,
 because you can add a factor to the whole query but...does it
 make sense? (Remember that MoreLikeThis can already boost each
 term inside the query).
 
 For example, this could be a result of MoreLikeThis (with native
 boosting enabled)
 
 queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29
 morelikethis^0.67)
 
 what I want to do is
 
 queryResultMLT = (this^0.4 is^0.5 a^0.6 query^0.33 of^0.29
 morelikethis^0.67)^0.60  ---(notice the boost of 0.60 for
 the whole query)
 
 That last boost wouldn't change the doc ordering at all, so it'd
 be kinda useless.
 
 What are you trying to accomplish?
 
 Erik