Re: Document boosting troubles

2010-06-17 Thread MitchK

Hi,



> One problem down, two left!  =)  bf ==> bq did the trick, thanks.  Now at
> least if I can't get the DIH solution working I don't have to tack that on
> every query string. 
> 
I would really recommend using a boost function. If your rank values change
in a future implementation, you will not need to redefine the bq. Besides
being more convenient, I think it also scales better.
The bq param is better suited to things like "boost this category" or "boost
the docs of an advertising campaign".

I am not sure, since I have never worked with the DIH this way, but my guess
is that the problem could be that you do not return the row, right?
If you don't, try again after adding a return row statement to your script.
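
A minimal sketch of the "return the row" suggestion, assuming a DIH script transformer written in JavaScript. The function name applyRankBoost and the makeRow stand-in (which only mimics the get/put calls of the row map so the snippet runs outside Solr) are made up for illustration; the boost values are the ones posted elsewhere in this thread. Whether the missing return is the actual cause of the problem is not confirmed here.

```javascript
// Boost values copied from the switch statement posted in this thread.
var BOOST_BY_RANK = {
    1: 3.0, 2: 2.6, 3: 2.2, 4: 1.8, 5: 1.5,
    6: 1.2, 7: 0.9, 8: 0.7, 9: 0.5
};

// Hypothetical transformer function: DIH would call it once per row.
function applyRankBoost(row) {
    // Number() accepts both 7 and '7', so string and integer ranks behave alike.
    var rank = Number(row.get('rank'));
    var boost = BOOST_BY_RANK[rank];
    // Ranks without an entry (including 10) fall back to 0.1, like the default branch.
    row.put('$docBoost', boost === undefined ? 0.1 : boost);
    // Hand the modified row back to the caller.
    return row;
}

// Stand-in for the row map (assumption: only get/put are needed here).
function makeRow(fields) {
    return {
        get: function (key) { return fields[key]; },
        put: function (key, value) { fields[key] = value; }
    };
}

var boosted = applyRankBoost(makeRow({ title: 'Some title 2', rank: 7 }));
console.log(boosted.get('$docBoost'));
```

In a DIH config, a function like this would typically be referenced from the entity's transformer attribute as script:applyRankBoost.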

Otherwise, I can't help you, since there are no more code examples available
on the mailing list (from what I have seen).

Maybe this mailing-list topic ("Using DIHs special commands Help needed")
helps you:
http://lucene.472066.n3.nabble.com/Using-DIH-s-special-commands-Help-needed-td475695.html#a475695
There are some suggestions; however, it seems he wasn't able to solve the
problem.



> And still can't figure out what I need to do with my dismax querying to
> get scores for quality of match. 
> 
I don't really understand what you mean. Can you explain it in a little more
detail?
What, apart from the $docBoost, does not work as it should?

Kind regards
- Mitch
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Document-boosting-troubles-tp902982p904129.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Document boosting troubles

2010-06-17 Thread dbashford

One problem down, two left!  =)  bf ==> bq did the trick, thanks.  Now at
least if I can't get the DIH solution working I don't have to tack that on
every query string.

Taking the quotes away from $docBoost results in a syntax error.  It needs to
be quoted.

Changed it to this and still no luck:

var rank = row.get('rank');
switch (rank) {
    case 1:
        row.put("$docBoost", 3.0);
        break;
    case 2:
        row.put("$docBoost", 2.6);
        break;
    case 3:
        row.put("$docBoost", 2.2);
        break;
    case 4:
        row.put("$docBoost", 1.8);
        break;
    case 5:
        row.put("$docBoost", 1.5);
        break;
    case 6:
        row.put("$docBoost", 1.2);
        break;
    case 7:
        row.put("$docBoost", 0.9);
        break;
    case 8:
        row.put("$docBoost", 0.7);
        break;
    case 9:
        row.put("$docBoost", 0.5);
        break;
    default:
        row.put("$docBoost", 0.1);
}



And still can't figure out what I need to do with my dismax querying to get
scores for quality of match.  Thoughts?


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Document-boosting-troubles-tp902982p903638.html


Re: Document boosting troubles

2010-06-17 Thread MitchK

Sorry, I overlooked your other question.



> <str name="bf">
> rank:1^10.0 rank:2^9.0 rank:3^8.0 rank:4^7.0 rank:5^6.0 rank:6^5.0
> rank:7^4.0 rank:8^3.0 rank:9^2.0
> </str>
> 

This is wrong.
You need to change "bf" to "bq":
bf -> boost function
bq -> boost query
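
To make the bf/bq distinction concrete, here is a small sketch that builds the two kinds of request parameters. It is only an illustration: the helper name buildDismaxParams, the /select path, and the use of defType=dismax with qf=title are assumptions, not settings taken from the poster's configuration. The function is written without spaces because, as far as I know, dismax treats whitespace in bf as a separator between functions.

```javascript
// Build dismax request parameters; qf=title matches the thread's two-field example.
function buildDismaxParams(userQuery, boostParams) {
    var params = new URLSearchParams();
    params.set('defType', 'dismax');
    params.set('q', userQuery);
    params.set('qf', 'title');
    // Copy whichever boost parameters (bq or bf) the caller supplied.
    Object.keys(boostParams).forEach(function (name) {
        params.set(name, boostParams[name]);
    });
    return params;
}

// bq: a boost query, written in ordinary query syntax.
var withBq = buildDismaxParams('red door', { bq: 'rank:1^3.0 rank:2^2.6 rank:3^2.2' });
// bf: a boost function, here the reciprocal of the rank field.
var withBf = buildDismaxParams('red door', { bf: 'div(1,rank)' });

console.log('/select?' + withBq.toString());
console.log('/select?' + withBf.toString());
```

A list of rank:N^boost clauses is query syntax, so it belongs in bq; putting it in bf makes the parser try to read it as a function.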
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Document-boosting-troubles-tp902982p903208.html


Re: Document boosting troubles

2010-06-17 Thread MitchK

Hi,

First of all, are you sure that row.put('$docBoost',docBoostVal) is correct?

I think it should be row.put($docBoost,docBoostVal); but unfortunately I am
not sure.

Hm, I think that until you can solve the problem with the docBoost itself, you
should use a function query.

Use "div(1, rank)" as the boost function (bf).

The higher the rank value, the smaller the result.
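
As a quick worked example of that arithmetic (plain JavaScript, not Solr code; the helper name reciprocalBoost is made up), this prints the value 1/rank for each rank from 1 to 10:

```javascript
// Mirrors the arithmetic of div(1, rank): the result is the reciprocal of the rank.
function reciprocalBoost(rank) {
    return 1 / rank;
}

// Print the value for every rank used in this thread (1 through 10).
for (var rank = 1; rank <= 10; rank++) {
    console.log('rank ' + rank + ' -> ' + reciprocalBoost(rank).toFixed(3));
}
```

Rank 1 yields 1, rank 2 yields 0.5, and rank 10 yields 0.1, so the drop between the first few ranks is much steeper than between the last few.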

Hope this helps!
- Mitch

 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Document-boosting-troubles-tp902982p903190.html


Document boosting troubles

2010-06-17 Thread dbashford

Brand new to this sort of thing, so bear with me.

For the sake of simplicity, I've got a two-field document: title and rank.
Title gets searched on; rank has values from 1 to 10, with 1 being the highest.
What I'd like to do is boost results of searches on title based on the
document's rank.

Because it's fairly cut and dried, I was hoping to do it during indexing.  I
have this in my DIH transformer:

var docBoostVal = 0;
switch (rank) {
    case '1':
        docBoostVal = 3.0;
        break;
    case '2':
        docBoostVal = 2.6;
        break;
    case '3':
        docBoostVal = 2.2;
        break;
    case '4':
        docBoostVal = 1.8;
        break;
    case '5':
        docBoostVal = 1.5;
        break;
    case '6':
        docBoostVal = 1.2;
        break;
    case '7':
        docBoostVal = 0.9;
        break;
    case '8':
        docBoostVal = 0.7;
        break;
    case '9':
        docBoostVal = 0.5;
        break;
}
row.put('$docBoost', docBoostVal);

It's my understanding that with this, I can simply do the same /select
queries I've been doing and expect documents to be boosted, but that doesn't
seem to be happening because I'm seeing things like this in the results...

{"title":"Some title 1",
 "rank":10,
 "score":0.11726039},
{"title":"Some title 2",
 "rank":7,
 "score":0.11726039},

Pretty much everything with the same score.  Whatever I'm doing isn't making
its way through. (To cover my bases I did try the case statement with
integers rather than strings, same result)
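
On the strings-versus-integers point: a JavaScript switch compares with strict equality, so a numeric rank never matches a case '1' label and a string rank never matches case 1. The sketch below only illustrates that language rule; the function names are made up, and nothing in this thread establishes that this is the cause of the identical scores.

```javascript
// Same shape as the transformer's switch, with string case labels.
function boostWithStringCases(rank) {
    switch (rank) {
        case '1': return 3.0;
        case '2': return 2.6;
        default:  return 0; // mirrors the initial docBoostVal of 0
    }
}

// Converting to a string first makes numeric and string ranks hit the same case.
function boostWithNormalizedRank(rank) {
    return boostWithStringCases(String(rank));
}

console.log(boostWithStringCases('1'));   // string rank matches case '1'
console.log(boostWithStringCases(1));     // numeric rank falls through to default
console.log(boostWithNormalizedRank(1));  // normalized rank matches again
```

Normalizing the value once, before the switch, removes any dependence on which type the row happens to return.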





With that not working, I started looking at other options and began playing
with dismax.

I'm able to add this to a query string and get results somewhat like what I'm
expecting...

bq=rank:1^3.0 rank:2^2.6 rank:3^2.2 rank:4^1.8 rank:5^1.5 rank:6^1.2
rank:7^0.9 rank:8^0.7 rank:9^0.5

...but I guess I wasn't expecting it to ONLY rank based on those factors. 
That essentially gives me a sort by rank.  

I'm trying to be super inclusive with the search, so while I'm fiddling, my
mm=1<1.  As expected, a query like q=red door is returning everything that
contains Red and door.  But I was hoping that items that matched "red door"
exactly would sort closer to the top, and that if that exact match was a
rank 7, its score wouldn't be exactly the same as all the other rank 7s.
Ditto if I searched for "q=The Tales Of": anything possessing all 3 terms
would sort closer to the top, then anything possessing two terms, then
anything possessing one term, and within those groups results would be
weighted heavily by rank.

I think I understand that the score is based entirely on the boosts I
provide...so how do I get something more like what I'm looking for?




Along those lines, I initially had put something like this in my defaults...

 
<str name="bf">
rank:1^10.0 rank:2^9.0 rank:3^8.0 rank:4^7.0 rank:5^6.0 rank:6^5.0
rank:7^4.0 rank:8^3.0 rank:9^2.0
</str>
 

...but that was not working, queries fail with a syntax exception.  Guessing
this won't work?



Thanks in advance for any help you can provide.

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Document-boosting-troubles-tp902982p902982.html