RE: QUERYPARSIN BOOSTING
From: Karthik N S
Sent: Wednesday, January 12, 2005 2:30 AM

Hi guys,

Apologies... If anybody has been closely watching Google, it boosts websites for paid category sites based on search words. Can this (boosting the full website) be achieved in Lucene's search based on the search word? If so, please explain, with examples.

With regards,
Karthik

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: QUERYPARSIN BOOSTING
On Jan 12, 2005, at 5:30 AM, Karthik N S wrote:

> If anybody has been closely watching Google, it boosts websites for paid category sites based on search words.

Do you have an example of this? My understanding is Google *separates* the display of sponsored sites and ad links (like the one a friend of mine registered for me on my name). Separating is different than boosting.

Erik
RE: QUERYPARSIN BOOSTING
Google has natural results on the left and sponsored results on the right. I do not believe the natural results are affected by paid keywords at all. What you seem to be describing is the behavior of the sponsored results, which I believe are explicitly attached to certain keywords.

The same approach would work in Lucene. Create a field to hold purchased keywords (any keywords you want to associate with the result). Then you can include this field in your search with a high boost (see DistributingMultiFieldQueryParser, http://issues.apache.org/bugzilla/show_bug.cgi?id=32674).

Google prefers certain results over others for certain keywords based on various factors of the keyword purchase and the site (amount paid for the keyword, Page Rank of the site, tenure of the listing, popularity of the listing, etc.). You could emulate this in various ways, using a combination of document/field boosting and perhaps replication of the term in the field (to increase its tf), or even perhaps multiple fields that are boosted at different levels. I'm not sure of the best approach to this part -- you could experiment a little.

Chuck
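As a rough illustration of the purchased-keywords idea above, here is a minimal Python sketch. It is a toy model and not Lucene code: the field boost values, the site names, and the "boost times term frequency" scoring rule are all assumptions made up for the example, and real Lucene scoring applies further factors.

```python
# Toy model, NOT Lucene scoring: each field contributes
# (field boost * term frequency) to a document's score.
FIELD_BOOSTS = {"contents": 1.0, "keywords": 20.0}  # hypothetical values

# Hypothetical sites; "keywords" holds the purchased keywords.
SITES = {
    "organic-shoe-blog": {"contents": "shoes shoes shoes reviews", "keywords": ""},
    "sponsor-basic": {"contents": "shoes", "keywords": "shoes"},
    # Repeating a purchased keyword raises its term frequency in that field.
    "sponsor-premium": {"contents": "shoes", "keywords": "shoes shoes"},
}


def score(site, term):
    """Sum the boosted term frequency of `term` over all fields."""
    return sum(
        boost * site[field].split().count(term)
        for field, boost in FIELD_BOOSTS.items()
    )


def rank(term):
    """Return the names of matching sites, best score first."""
    scored = [(score(site, term), name) for name, site in SITES.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for value, name in ordered if value > 0]
```

In this sketch, `rank("shoes")` puts both sponsored sites ahead of the organic one, and the site that repeats the purchased keyword comes first; `rank("reviews")` returns only the organic site, because nobody purchased that keyword.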
Re: QUERYPARSIN BOOSTING
From: Nader Henein
Sent: Tuesday, January 11, 2005 12:21 AM

From the text on the Lucene Jakarta site: http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

Lucene provides the relevance level of matching documents based on the terms found. To boost a term use the caret, ^, symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.

Boosting allows you to control the relevance of a document by boosting its term. For example, if you are searching for

    jakarta apache

and you want the term jakarta to be more relevant, boost it using the ^ symbol along with the boost factor next to the term. You would type:

    jakarta^4 apache

This will make documents with the term jakarta appear more relevant. You can also boost Phrase Terms as in the example:

    "jakarta apache"^4 "jakarta lucene"

By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2).

Regards.
Nader Henein

Karthik N S wrote:

> Hi guys,
>
> Apologies... This question may have been asked a million times on this forum, but I need some clarifications.
>
> 1) FieldType = keyword, name = vendor
> 2) FieldType = text, name = contents
>
> Questions:
>
> 1) How do I construct a query that makes hits for the VENDOR appear first?
> 2) If boosting is to be applied, how?
> 3) Is the query constructed below correct?
>
>     +Contents:shoes +((vendor:nike)^10)
>
> Please advise. Thanks in advance.
>
> With warm regards, have a nice day
> [N.S. Karthik]
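To make the caret syntax quoted above concrete, here is a small Python sketch that splits a query string into (text, boost) pairs. It is only an illustration of the notation, not Lucene's QueryParser; the regular expression and the `parse_boosts` name are hypothetical, and it ignores field prefixes, operators, and escaping.

```python
import re

# Toy tokenizer for the boost notation: term^factor or "a phrase"^factor.
# Illustration only; this is NOT Lucene's QueryParser.
TOKEN = re.compile(r'(?:"([^"]+)"|([^\s"^]+))(?:\^(\d+(?:\.\d+)?))?')


def parse_boosts(query):
    """Return a list of (text, boost) pairs; the boost defaults to 1.0."""
    clauses = []
    for phrase, term, boost in TOKEN.findall(query):
        factor = float(boost) if boost else 1.0
        if factor <= 0:
            # The quoted documentation says the factor must be positive.
            raise ValueError("boost factor must be positive")
        clauses.append((phrase or term, factor))
    return clauses
```

For example, `parse_boosts('jakarta^4 apache')` yields `[('jakarta', 4.0), ('apache', 1.0)]`: the unboosted term falls back to the default factor of 1.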
RE: QUERYPARSIN BOOSTING
From: Chuck Williams
Sent: Tuesday, January 11, 2005 2:00 PM

Karthik,

I don't think the boost in your example does much since you are using an AND query, i.e. all hits will have to contain both vendor:nike and contents:shoes. If you used an OR, then the boost would put nike products above (non-nike) shoes, unless there was some other factor that causes score of contents:shoes to be 10x greater than that of vendor:nike.

It's a good idea to look at the results of explain() when analyzing what's happening with scoring, tuning your boosts and your Similarity.

Chuck
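The AND versus OR point can be seen in a toy model. The Python sketch below is not Lucene's scoring: it assumes a made-up rule where each matching clause adds (boost times term frequency), and the documents and their ids are hypothetical. It only shows why a boost on a required clause shifts every hit by the same amount, while a boost on an optional clause can reorder hits.

```python
# Toy scoring model, NOT Lucene's Similarity: a clause contributes
# (boost * term frequency) when it matches, and nothing otherwise.
DOCS = {
    "nike-runner": {"vendor": "nike", "contents": "shoes shoes"},
    "nike-basic": {"vendor": "nike", "contents": "shoes"},
    "nike-shirt": {"vendor": "nike", "contents": "shirt"},
    "adidas-runner": {"vendor": "adidas", "contents": "shoes shoes shoes"},
}


def clause_score(doc, field, term, boost):
    return boost * doc[field].split().count(term)


def search(clauses, require_all):
    """clauses: list of (field, term, boost). Returns doc ids, best first.

    require_all=True models +a +b (AND); False models a b (OR).
    """
    hits = []
    for doc_id, doc in DOCS.items():
        scores = [clause_score(doc, f, t, b) for f, t, b in clauses]
        matched = [s > 0 for s in scores]
        keep = all(matched) if require_all else any(matched)
        if keep:
            hits.append((sum(scores), doc_id))
    # Sort by descending score, then by id so ties are deterministic.
    ordered = sorted(hits, key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for total, doc_id in ordered]


boosted = [("contents", "shoes", 1), ("vendor", "nike", 10)]
unboosted = [("contents", "shoes", 1), ("vendor", "nike", 1)]
```

With `require_all=True`, `boosted` and `unboosted` return the same two nike shoe documents in the same order, since both hits get the same vendor contribution. With `require_all=False`, the boost moves every nike document above `adidas-runner`, which otherwise ties for the top spot on term frequency alone.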