RE: QUERYPARSIN BOOSTING

2005-01-12 Thread Karthik N S
Hi guys,

Apologies...

If anybody has been watching Google closely, it boosts websites for
paid category sites based on the search words.

Can this (boosting the full website) be achieved in Lucene's search, based
on the search word?

If so, please explain, with examples.

With regards,
Karthik



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: QUERYPARSIN BOOSTING

2005-01-12 Thread Erik Hatcher
On Jan 12, 2005, at 5:30 AM, Karthik N S wrote:

   If anybody has been watching Google closely, it boosts websites for
   paid category sites based on the search words.

Do you have an example of this?  My understanding is that Google *separates*
the display of sponsored sites and ad links (like the one a friend of
mine registered for me on my name).  Separating is different from
boosting.

Erik


RE: QUERYPARSIN BOOSTING

2005-01-12 Thread Chuck Williams
Google has natural results on the left and sponsored results on the
right.  I do not believe the natural results are affected by paid
keywords at all.  What you seem to be describing is the behavior of the
sponsored results, which I believe are explicitly attached to certain
keywords.

The same approach would work in Lucene.  Create a field to hold
purchased keywords (any keywords you want to associate with the
result).  Then you can include this field in your search with a high
boost (see DistributingMultiFieldQueryParser,
http://issues.apache.org/bugzilla/show_bug.cgi?id=32674).
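
A rough illustration of the purchased-keywords idea, written as a toy Python model rather than Lucene code. Each document carries an ordinary `contents` field plus a hypothetical `sponsored` field holding its purchased keywords, and the query term is looked up in both fields with a much higher boost on `sponsored`. The field names, the boost values, and the additive scoring are all assumptions made for this sketch; real Lucene scoring also involves tf, idf, and norms.

```python
# Toy model, not the Lucene API: a document is a dict of field -> set of terms.
# The "sponsored" field name and both boost values are assumptions.
FIELD_BOOSTS = {"contents": 1.0, "sponsored": 20.0}


def score(doc, term, field_boosts=FIELD_BOOSTS):
    """Sum the boost of every field in which the term occurs."""
    return sum(boost for field, boost in field_boosts.items()
               if term in doc.get(field, set()))


def search(docs, term, field_boosts=FIELD_BOOSTS):
    """Return (score, name) pairs for matching documents, best first."""
    hits = [(score(doc, term, field_boosts), name) for name, doc in docs.items()]
    return sorted((h for h in hits if h[0] > 0), key=lambda h: (-h[0], h[1]))


DOCS = {
    "site-a": {"contents": {"shoes", "boots"}},
    "site-b": {"contents": {"shoes"}, "sponsored": {"shoes"}},
}

print(search(DOCS, "shoes"))  # site-b bought "shoes", so it ranks first
print(search(DOCS, "boots"))  # nobody bought "boots"; only contents matches
```

In this model the sponsored listing only rises for the keyword it purchased; for any other search word it competes on its contents alone.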

Google prefers certain results over others for certain keywords based on
various factors of the keyword purchase and the site (amount paid for
the keyword, PageRank of the site, tenure of the listing, popularity of
the listing, etc.).  You could emulate this in various ways, using a
combination of document/field boosting and perhaps replication of the
term in the field (to increase its tf), or perhaps even multiple fields
that are boosted at different levels.  I'm not sure of the best approach
to this part -- you could experiment a little.
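
To make the two levers above concrete, here is a small toy model (plain Python, not Lucene). It assumes a score of field boost times the square root of the term frequency, which mirrors the tf factor of Lucene's default Similarity but deliberately leaves out idf and length normalization. The tier names and repetition counts are made up for illustration.

```python
import math


def keyword_score(field_tokens, term, field_boost=1.0):
    """Toy score: field boost times sqrt(term frequency in the field)."""
    tf = field_tokens.count(term)
    return field_boost * math.sqrt(tf)


def replicate(keyword, repetitions):
    """Build the token list for a keyword field that repeats one keyword."""
    return [keyword] * repetitions


# Hypothetical payment tiers mapped to how often the keyword is indexed.
TIERS = {"gold": 9, "silver": 4, "basic": 1}

for tier, repetitions in TIERS.items():
    print(tier, keyword_score(replicate("shoes", repetitions), "shoes"))

# The same ranking effect expressed as a field boost instead of replication.
print(keyword_score(replicate("shoes", 1), "shoes", field_boost=3.0))
```

Replication grows the score only with the square root of the repetition count, and in a real index length normalization (ignored here) can offset some of that gain, so a field boost may be the more predictable knob. Checking explain() on a real index would settle it.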

Chuck


Re: QUERYPARSIN BOOSTING

2005-01-11 Thread Nader Henein
From the text on the Lucene Jakarta Site : 
http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

Lucene provides the relevance level of matching documents based on the 
terms found. To boost a term use the caret, ^, symbol with a boost 
factor (a number) at the end of the term you are searching. The higher 
the boost factor, the more relevant the term will be.

   Boosting allows you to control the relevance of a document by
   boosting its term. For example, if you are searching for


jakarta apache


   and you want the term jakarta to be more relevant boost it using
   the ^ symbol along with the boost factor next to the term. You would
   type:


jakarta^4 apache


   This will make documents with the term jakarta appear more relevant.
   You can also boost Phrase Terms as in the example:


"jakarta apache"^4 "jakarta lucene"


   By default, the boost factor is 1. Although the boost factor must be
   positive, it can be less than 1 (e.g. 0.2)
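
The caret syntax quoted above can be sketched as a tiny parser. This is only an illustration in Python, not Lucene's QueryParser: it handles bare words and double-quoted phrases with an optional `^boost` suffix, and nothing else (no field names, no `+`/`-` operators, no grouping). The function name `parse_boosts` is made up for the example.

```python
import re

# One clause: a double-quoted phrase or a bare word, optionally followed by ^boost.
_CLAUSE = re.compile(r'(?:"([^"]+)"|([^\s"^]+))(?:\^(\d+(?:\.\d+)?))?')


def parse_boosts(query):
    """Return a list of (text, boost) pairs; the boost defaults to 1.0."""
    clauses = []
    for match in _CLAUSE.finditer(query):
        phrase, word, boost = match.groups()
        clauses.append((phrase or word, float(boost) if boost else 1.0))
    return clauses


print(parse_boosts('jakarta^4 apache'))
print(parse_boosts('"jakarta apache"^4 "jakarta lucene"'))
print(parse_boosts('jakarta^0.2 apache'))
```

Note that a phrase has to be written in double quotes for the boost to apply to the phrase as a whole rather than to its last word.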

Regards.

Nader Henein

Karthik N S wrote:

   Hi guys,

   Apologies...

   This question may have been asked a million times on this forum, but I
   need some clarification.

   1) FieldType = keyword, name = vendor
   2) FieldType = text, name = contents

   Questions:

   1) How do I construct a query so that hits for the VENDOR appear first?
   2) If boosting is to be applied, how?
   3) Is the query constructed below correct?

   +Contents:shoes +((vendor:nike)^10)

   Please advise.
   Thanks in advance.

   With warm regards,
   Have a nice day,
   [N.S.KARTHIK]



RE: QUERYPARSIN BOOSTING

2005-01-11 Thread Chuck Williams
Karthik,

I don't think the boost in your example does much, since you are using an
AND query, i.e. all hits will have to contain both vendor:nike and
contents:shoes.  If you used an OR, then the boost would put nike
products above (non-nike) shoes, unless some other factor caused the
score of contents:shoes to be 10x greater than that of vendor:nike.
It's a good idea to look at the results of explain() when analyzing
what's happening with scoring and when tuning your boosts and your
Similarity.
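
A toy model of the AND versus OR point, in plain Python rather than Lucene. It assumes each matching clause simply contributes its boost to the score, which is far cruder than real Lucene scoring, and the sample documents are invented. It is only meant to show why a boost on a required clause leaves the ordering alone while the same boost on an optional clause reorders the hits.

```python
def score(doc, clauses):
    """clauses: list of (field, term, boost, required).
    Returns None when the document is not a hit, else the summed boosts."""
    total, matched_any = 0.0, False
    for field, term, boost, required in clauses:
        matched = term in doc.get(field, set())
        if required and not matched:
            return None
        if matched:
            total += boost
            matched_any = True
    return total if matched_any else None


def search(docs, clauses):
    """Return hit names, best score first (ties broken by name)."""
    hits = [(score(doc, clauses), name) for name, doc in docs.items()]
    hits = [h for h in hits if h[0] is not None]
    return [name for _, name in sorted(hits, key=lambda h: (-h[0], h[1]))]


def make_query(required, vendor_boost):
    """contents:shoes combined with vendor:nike^boost, as AND or OR."""
    return [("contents", "shoes", 1.0, required),
            ("vendor", "nike", vendor_boost, required)]


DOCS = {
    "nike-shoe": {"vendor": {"nike"}, "contents": {"shoes"}},
    "nike-boot": {"vendor": {"nike"}, "contents": {"shoes", "boots"}},
    "nike-shirt": {"vendor": {"nike"}, "contents": {"shirt"}},
    "adidas-shoe": {"vendor": {"adidas"}, "contents": {"shoes"}},
}

print(search(DOCS, make_query(required=True, vendor_boost=10.0)))
print(search(DOCS, make_query(required=True, vendor_boost=1.0)))
print(search(DOCS, make_query(required=False, vendor_boost=10.0)))
print(search(DOCS, make_query(required=False, vendor_boost=1.0)))
```

With AND, every hit matches vendor:nike, so the boost raises all scores by the same amount and the order does not move. With OR, the boosted vendor clause lifts every nike document above the non-nike shoes.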

Chuck
