Re: How to correctly boost results in Solr Dismax query
: Is not particularly helpful. I tried adding a bq argument to my
: search:
:
: &bq=media:DVD^2
:
: (yes, this is an index of films!) but I find when I start adding more
: and more:
:
: &bq=media:DVD^2&bq=media:BLU-RAY^1.5
:
: I find the negative results - e.g. films that are DVD but are not
: BLU-RAY get negatively affected in their score. In the end it all seems

That shouldn't be happening ... the outermost BooleanQuery (that the main "q" and all of the "bq" queries are added to) has its "coordFactor" disabled, so documents aren't penalized for not matching bq clauses.

What you may be seeing is that the raw numeric score values returned by Solr are lower for documents that match "DVD" when you add the "BLU-RAY" bq ... that's totally possible, because *absolute* scores from one query can't be compared to scores from another query -- what's important is that the *relative* order of scores from doc1 and doc2 should be consistent (ie: the score for a doc matching DVD might go down when you add the BLU-RAY bq, but the scores for *all* documents not matching BLU-RAY should go down some).

The important things to look for are:

1) are DVD docs sorting higher than they would without the DVD bq?
2) are BLU-RAY docs sorting higher than they would without the BLU-RAY bq?
3) are two docs that are equivalent except for a DVD/BLU-RAY distinction sorting such that the doc with the larger boost (DVD, given the boosts above) comes first?

...the answers to all of those should be yes. If you're seeing otherwise, please post the query toString output for both queries, and the score explanations for the docs in question against both queries.

-Hoss
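One way to act on the "relative order, not absolute score" advice is to compare the ordering of the same documents across the two queries. The sketch below is only an illustration: the helper name `same_relative_order`, the document ids, and all of the scores are made up, not real Solr output.

```python
from itertools import combinations


def same_relative_order(scores_a, scores_b, doc_ids):
    """Return True if every pair of doc_ids is ordered the same way in both runs."""
    for x, y in combinations(doc_ids, 2):
        # -1, 0 or 1 depending on which document of the pair scores higher.
        order_a = (scores_a[x] > scores_a[y]) - (scores_a[x] < scores_a[y])
        order_b = (scores_b[x] > scores_b[y]) - (scores_b[x] < scores_b[y])
        if order_a != order_b:
            return False
    return True


# Hypothetical scores for the same three documents under two queries.
with_dvd_bq = {"dvd_1": 1.80, "dvd_2": 1.20, "bluray_1": 0.90}
with_both_bqs = {"dvd_1": 1.10, "dvd_2": 0.75, "bluray_1": 0.95}

# Absolute scores of the DVD docs dropped, but their order relative to
# each other is unchanged.
print(same_relative_order(with_dvd_bq, with_both_bqs, ["dvd_1", "dvd_2"]))

# Including the BLU-RAY doc, the order does change: it moved up past dvd_2.
print(same_relative_order(with_dvd_bq, with_both_bqs, list(with_dvd_bq)))
```

In this made-up data the DVD documents score lower in absolute terms after the second bq is added, yet nothing has gone wrong: the only ordering change is the BLU-RAY document moving up, which is what its boost is for.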
Re: How to correctly boost results in Solr Dismax query
: bq works only with q.alt query and not with q queries. So, in your case you
: would be using qf parameter for field boosting, you will have to give both
: the fields in qf parameter i.e. both title and media.

FWIW: that statement is false. The "boost query" (bq) is added to the query regardless of whether "q" or "q.alt" is ultimately used. If you turn on debugQuery=true and look at the debug output, you can see exactly what the resulting query is (parsedquery).

Using the example setup, compare the output from these examples...

http://localhost:8983/solr/select/?q.alt=baz&q=solr&defType=dismax&qf=name+cat&bq=foo&debugQuery=true
http://localhost:8983/solr/select/?q.alt=solr&q=&defType=dismax&qf=name+cat&bq=foo&debugQuery=true

-Hoss
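For anyone who wants to script that comparison, here is a small sketch that only builds the two URLs from the message above. The helper name `debug_url` is made up for illustration, and nothing is fetched; it assumes the example Solr setup is listening on localhost:8983 as in the URLs quoted above.

```python
from urllib.parse import urlencode

# Base URL taken from the example setup referred to above.
BASE = "http://localhost:8983/solr/select/"


def debug_url(q, q_alt):
    """Build a dismax request URL with a boost query and debugQuery enabled."""
    params = [
        ("q.alt", q_alt),
        ("q", q),
        ("defType", "dismax"),
        ("qf", "name cat"),   # urlencode turns the space into '+'
        ("bq", "foo"),
        ("debugQuery", "true"),
    ]
    return BASE + "?" + urlencode(params)


# Case 1: "q" is present, so q.alt is not used.
url_q = debug_url(q="solr", q_alt="baz")
# Case 2: "q" is empty, so q.alt is used instead.
url_q_alt = debug_url(q="", q_alt="solr")

print(url_q)
print(url_q_alt)
```

Opening both URLs and comparing the parsed query in the debug section should show the bq clause present in each case.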
RE: How to correctly boost results in Solr Dismax query
Thank you Dean. I thought I was on the right track with BQ but it was the skewing of results that was frustrating me. I'll try out your suggestion.

Cheers,
Pete
Re: How to correctly boost results in Solr Dismax query
Also note that we have an open, related issue on Lucene's bug tracking system. omitTf might get renamed so that it's clearer that positional information is not stored, which prevents phrase queries.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
RE: How to correctly boost results in Solr Dismax query
If you just discovered the omitTf parameter because of this post, please be aware that I've not really explained its purpose properly, and note that using it will prevent phrase queries from working. See this thread for clarification on its use:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200903.mbox/%3c897559.95769...@web50301.mail.re2.yahoo.com%3e

-- Dean
RE: How to correctly boost results in Solr Dismax query
Hi,

My experience is that the BQ parameter can be used with any query type. You can define boosts on the query fields (qf) that are used with the query terms (q) in your query, AND you can define additional boosts for fields that are not used with the query terms through the bq or bf parameters.

I think the relative weight that assigning a particular boost to a field via BQ has on the overall scoring needs to take into consideration the other fields in your query. If you're searching on titles, you might want to consider setting omitNorms=true (means don't generate length normalization vectors) for title in your schema.xml, and if you're using Solr 1.4 omitTf=true (means don't generate term frequency vectors), so that results aren't skewed by short and long titles, or titles that contain multiple occurrences of the same term (setting these requires you to reindex). I think this should have the effect of making BQ boosts like &bq=media:DVD^2&bq=media:BLU-RAY^1.5 more effective.

-- Dean
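To see why title length and repeated terms can skew keyword scores, here is a rough sketch of the two factors Dean mentions, assuming the classic Lucene TF-IDF similarity (length norm of 1/sqrt(number of terms), term-frequency factor of sqrt(frequency)). It uses plain whitespace tokenization and ignores the lossy encoding Lucene applies to stored norms, so the numbers are illustrative only.

```python
import math


def length_norm(num_terms):
    """Classic length normalization: shorter fields get a larger factor."""
    return 1.0 / math.sqrt(num_terms)


def tf_factor(freq):
    """Classic term-frequency factor: repeated terms count more, sub-linearly."""
    return math.sqrt(freq)


titles = [
    "Casino Royale",
    "Indiana Jones and the Kingdom of the Crystal Skull",
]

for title in titles:
    # Whitespace split is an assumption; a real analyzer may drop or alter terms.
    num_terms = len(title.split())
    print(f"{title!r}: {num_terms} terms, length norm {length_norm(num_terms):.3f}")

# A term appearing four times contributes twice as much as one appearing once.
print(tf_factor(1), tf_factor(4))
```

With norms omitted, every title contributes the same length factor, so a fixed bq boost makes up a more predictable share of the final score.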
Re: How to correctly boost results in Solr Dismax query
Hi,

On Fri, 2009-03-13 at 03:57 -0700, dabboo wrote:
> bq works only with q.alt query and not with q queries. So, in your case you
> would be using qf parameter for field boosting, you will have to give both
> the fields in qf parameter i.e. both title and media.
>
> try this
>
> media^1.0 title^100.0

But with that, how will it know to rank media:DVD higher than media:BLU-RAY?

Cheers,
Pete
Re: How to correctly boost results in Solr Dismax query
Pete,

bq works only with the q.alt query and not with q queries. So, in your case you would be using the qf parameter for field boosting; you will have to give both fields in the qf parameter, i.e. both title and media.

Try this:

media^1.0 title^100.0
Re: How to correctly boost results in Solr Dismax query
Hi Amit,

Thanks again for your reply. I understand it a bit better now, but I think it would help if I posted an example. Say I have three records:

1 BLU-RAY Indiana Jones and the Kingdom of the Crystal Skull
2 DVD Indiana Jones and the Kingdom of the Crystal Skull
3 DVD Casino Royale

Now, if I search for indiana:

select?q=indiana

I want the first two records to come back (not the third, as it does not contain 'indiana'). I would like record 2 to score higher than record 1 because its media type is DVD.

At the moment I have in my config:

title

I was trying to boost on media having a specific value by using 'bq', but from what you told me that is incorrect.

Cheers,
Pete

On Fri, 2009-03-13 at 03:21 -0700, dabboo wrote:
> Pete,
>
> Sorry if I wasn't clear. Here is the explanation.
>
> Suppose you have 2 records with films and media as 2 columns.
>
> The first record has films="Indiana" and media="blue ray",
> and the 2nd record has films="Bond" and media="Indiana".
>
> Values for the qf parameter:
>
> media^2.0 films^1.0
>
> Now search for q=Indiana. It should return both records, but
> record #2 will be displayed above the 1st.
>
> Let me know if you still have questions.
>
> Cheers,
> amit
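For readers following the example, here is a minimal sketch of how the request described above could be assembled. The base URL and `defType=dismax` are taken from other examples in this thread; the field names `title` and `media` come from Pete's description, and the helper name `dismax_url` is purely illustrative. It only builds the URL and makes no claim about the resulting ranking.

```python
from urllib.parse import urlencode


def dismax_url(base, keywords, query_fields, boost_queries):
    """Build a dismax request URL with zero or more bq parameters.

    `base`, the field names, and the boost values are assumptions
    borrowed from the thread, not a verified production setup.
    """
    params = [("defType", "dismax"), ("q", keywords), ("qf", query_fields)]
    # Each boost query becomes its own bq parameter.
    params += [("bq", bq) for bq in boost_queries]
    return base + "?" + urlencode(params)


url = dismax_url(
    "http://localhost:8983/solr/select",  # assumed local example instance
    "indiana",
    "title",
    ["media:DVD^2"],
)
print(url)
```

Adding `debugQuery=true` to the parameter list should make it possible to inspect how the boost query was combined with the main query.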
Re: How to correctly boost results in Solr Dismax query
Pete,

Sorry if I wasn't clear. Here is the explanation.

Suppose you have 2 records with films and media as 2 columns.

The first record has films="Indiana" and media="blue ray", and the 2nd record has films="Bond" and media="Indiana".

Values for the qf parameter:

media^2.0 films^1.0

Now search for q=Indiana. It should return both records, but record #2 will be displayed above the 1st.

Let me know if you still have questions.

Cheers,
amit

--
View this message in context: http://www.nabble.com/How-to-correctly-boost-results-in-Solr-Dismax-query-tp22476204p22493646.html
Sent from the Solr - User mailing list archive at Nabble.com.
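The two-record example above can be mimicked with a deliberately simplified toy model. This is not Solr's scoring (which also depends on term statistics and other factors); it only assumes that each field containing the keyword contributes its qf boost and that the largest contribution wins. The function name `toy_dismax_score` is hypothetical.

```python
def toy_dismax_score(doc, term, qf):
    """Toy stand-in for a per-field max score: NOT Solr's real formula.

    A field "matches" if the keyword appears as a whole word in it;
    a matching field contributes its qf boost, and the maximum is kept.
    """
    term = term.lower()
    matches = [
        boost
        for field, boost in qf.items()
        if term in doc.get(field, "").lower().split()
    ]
    return max(matches, default=0.0)


records = [
    {"id": 1, "films": "Indiana", "media": "blue ray"},
    {"id": 2, "films": "Bond", "media": "Indiana"},
]
qf = {"media": 2.0, "films": 1.0}  # media^2.0 films^1.0

ranked = sorted(
    records, key=lambda d: toy_dismax_score(d, "Indiana", qf), reverse=True
)
print([d["id"] for d in ranked])  # [2, 1]
```

In this toy model record 2 ranks first only because the keyword itself occurs in the media field. A record whose media field holds some other value, such as DVD, gains nothing from the media boost when the keyword is indiana.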
Re: How to correctly boost results in Solr Dismax query
Hi Amit,

Thanks very much for your reply. What you said makes things a bit clearer but I am still a bit confused.

On Thu, 2009-03-12 at 23:14 -0700, dabboo wrote:
> If you want to boost the records with their field value then you must use q
> query parameter instead of q.alt. 'q' parameter actually uses qf parameters
> from solrConfig for field boosting.

From the documentation for Dismax queries, I thought that "q" is simply a keyword parameter:

From http://wiki.apache.org/solr/DisMaxRequestHandler:

q
The guts of the search defining the main "query". This is designed to be support raw input strings provided by users with no special escaping. '+' and '-' characters are treated as "mandatory" and "prohibited" modifiers for the subsequent terms. Text wrapped in balanced quote characters '"' are treated as phrases, any query containing an odd number of quote characters is evaluated as if there were no quote characters at all. Wildcards in this "q" parameter are not supported.

And I thought 'qf' is a list of fields and boost scores:

From http://wiki.apache.org/solr/DisMaxRequestHandler:

qf (Query Fields)
List of fields and the "boosts" to associate with each of them when building DisjunctionMaxQueries from the user's query. The format supported is fieldOne^2.3 fieldTwo fieldThree^0.4, which indicates that fieldOne has a boost of 2.3, fieldTwo has the default boost, and fieldThree has a boost of 0.4 ... this indicates that matches in fieldOne are much more significant than matches in fieldTwo, which are more significant than matches in fieldThree.

But if I want to, say, search for films with 'indiana' in the title, with media=DVD scoring higher than media=BLU-RAY, then do I need to do something like:

solr/select?q=indiana

And in my config:

media^2

But I don't see where the actual *contents* of the media field would determine the boost.

Sorry if I have misunderstood what you mean.

Cheers,
Pete

--
Pete Smith
Developer

No.9 | 6 Portal Way | London | W3 6RU |
T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111

LOVEFiLM.com
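The qf format quoted from the wiki above can be read mechanically. The following is a small sketch of a parser for that format. The function name `parse_qf` is hypothetical, and the default boost of 1.0 is an assumption: the wiki text only says "the default boost".

```python
def parse_qf(qf, default_boost=1.0):
    """Parse a qf string such as 'fieldOne^2.3 fieldTwo fieldThree^0.4'.

    Returns a dict mapping field name to boost. Fields without an
    explicit ^boost get `default_boost` (assumed to be 1.0 here).
    """
    boosts = {}
    for token in qf.split():
        field, sep, boost = token.partition("^")
        boosts[field] = float(boost) if sep else default_boost
    return boosts


print(parse_qf("fieldOne^2.3 fieldTwo fieldThree^0.4"))
# {'fieldOne': 2.3, 'fieldTwo': 1.0, 'fieldThree': 0.4}
```

Note that the parsed result only maps field names to weights. Nothing in the qf format refers to the values stored in those fields, which is the gap Pete points out above.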
Re: How to correctly boost results in Solr Dismax query
Hi Pete,

The bq parameter works with the q.alt query parameter. If you are passing the search criteria using the q.alt query parameter, then the bq parameter comes into the picture. Also, q.alt doesn't support field boosting.

If you want to boost the records with their field value, then you must use the q query parameter instead of q.alt. The 'q' parameter actually uses the qf parameters from solrConfig for field boosting.

Let me know if you have any questions.

Thanks,
Amit Garg

Pete Smith-3 wrote:
>
> Hi,
>
> I have managed to build an index in Solr which I can search on keyword,
> produce facets, query facets etc. This is all working great. I have
> implemented my search using a dismax query so it searches predetermined
> fields.
>
> However, my results are coming back sorted by score which appears to be
> calculated by keyword relevancy only. I would like to adjust the score
> where fields have pre-determined values. I think I can do this with
> boost query and boost functions but the documentation here:
>
> http://wiki.apache.org/solr/DisMaxRequestHandler#head-6862070cf279d9a09bdab971309135c7aea22fb3
>
> Is not particularly helpful. I tried adding a bq argument to my
> search:
>
> &bq=media:DVD^2
>
> (yes, this is an index of films!) but I find when I start adding more
> and more:
>
> &bq=media:DVD^2&bq=media:BLU-RAY^1.5
>
> I find the negative results - e.g. films that are DVD but are not
> BLU-RAY get negatively affected in their score. In the end it all seems
> to even out and my score is as it was before i started boosting.
>
> I must be doing this wrong and I wonder whether "boost function" comes
> in somewhere. Any ideas on how to correctly use boost?
>
> Cheers,
> Pete
>
> --
> Pete Smith
> Developer
>
> No.9 | 6 Portal Way | London | W3 6RU |
> T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111
>
> LOVEFiLM.com
>

--
View this message in context: http://www.nabble.com/How-to-correctly-boost-results-in-Solr-Dismax-query-tp22476204p22490850.html
Sent from the Solr - User mailing list archive at Nabble.com.
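Another reply in this thread disputes the claim that bq only applies with q.alt, and suggests comparing the parsed query with debugQuery=true in both cases. The sketch below only rebuilds the two comparison URLs given in that reply. The base URL and the qf fields `name` and `cat` belong to the Solr example setup mentioned there, and the helper name `debug_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL of the example setup referenced elsewhere in this thread.
BASE = "http://localhost:8983/solr/select/"


def debug_url(q, q_alt):
    """Build a dismax URL with a bq parameter and debugQuery=true."""
    params = {
        "q.alt": q_alt,
        "q": q,
        "defType": "dismax",
        "qf": "name cat",
        "bq": "foo",
        "debugQuery": "true",
    }
    return BASE + "?" + urlencode(params)


# Case 1: q is set, so the main query comes from q.
with_q = debug_url(q="solr", q_alt="baz")
# Case 2: q is empty, so q.alt is used instead.
with_q_alt = debug_url(q="", q_alt="solr")

print(with_q)
print(with_q_alt)
```

Fetching both URLs and comparing the parsedQuery entries in the debug output should show whether the boost query is present in each case.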