Hi,

>>
My Question: Does Lucene use TF/IDF for getting this? (which would mean it does not 
use the boolean model for the boolean query...)
>>

Lucene indeed uses TF/IDF with length normalization for fields and documents. 

However, Lucene is "downward compatible" with the Boolean Model, where
documents are represented as 0/1 vectors in vector space. Ranking just
adds weights to the elements of the result set, so the underlying
interpretation of a query result can still be that of a
propositional/Boolean model: if a document appears in the result,
its tokens make the query (which is really a propositional
formula over words and phrases) evaluate to true. Lucene's document
representation is more complex than the Boolean Model requires,
which is why Lucene can efficiently handle phrase and
proximity searches, but these seem to be compatible extensions:
if you can do it in the Boolean Model, you can do it in Lucene :)
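
To make the Boolean reading concrete, here is a minimal sketch in plain Java. It does not use Lucene at all; the class `BooleanModelSketch` and its methods `term`, `search` and `sampleDocs` are hypothetical names made up for illustration. Each document is just a set of tokens, and a query is a propositional formula that is either true or false for that set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical illustration of the Boolean Model; none of these names are Lucene classes.
final class BooleanModelSketch {

    // An atomic proposition: "the document contains this word".
    static Predicate<Set<String>> term(String word) {
        return tokens -> tokens.contains(word);
    }

    // Returns the ids of all documents whose tokens make the formula true.
    // There is no score: a document is either in the result or it is not.
    static List<String> search(Map<String, Set<String>> docs, Predicate<Set<String>> query) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : docs.entrySet()) {
            if (query.test(e.getValue())) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // Three tiny example documents, kept in insertion order.
    static Map<String, Set<String>> sampleDocs() {
        Map<String, Set<String>> docs = new LinkedHashMap<>();
        docs.put("d1", new HashSet<>(Arrays.asList("dog", "horse", "barn")));
        docs.put("d2", new HashSet<>(Arrays.asList("dog", "cat")));
        docs.put("d3", new HashSet<>(Arrays.asList("horse", "saddle")));
        return docs;
    }

    public static void main(String[] args) {
        // The query "dog AND horse" as a propositional formula.
        Predicate<Set<String>> query = term("dog").and(term("horse"));
        System.out.println(search(sampleDocs(), query)); // prints [d1]
    }
}
```

A ranked engine returns the same set of documents for this query and only attaches a weight to each member.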

One place where Lucene is not 100% compatible with a basic Boolean Model is that 
full negation is a bit tricky: you cannot simply ask for all documents that 
do not contain a certain term unless you also have some term that appears in all 
documents. Not a big deal, really. 
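
The workaround can be sketched with a toy inverted index in plain Java. This is not Lucene code: `NegationSketch`, the marker token `__all__` and the method names are assumptions made for illustration. The point is that a postings list only tells you which documents contain a term, so a pure NOT needs something to subtract from, here a token added to every document.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy inverted index; not a Lucene class.
final class NegationSketch {

    // Marker token that is indexed for every document (an assumption of this sketch).
    static final String ALL = "__all__";

    // term -> ids of the documents that contain it
    final Map<String, Set<Integer>> postings = new HashMap<>();

    void add(int docId, String... tokens) {
        for (String t : tokens) {
            postings.computeIfAbsent(t, k -> new TreeSet<>()).add(docId);
        }
        // Every document also gets the marker token.
        postings.computeIfAbsent(ALL, k -> new TreeSet<>()).add(docId);
    }

    Set<Integer> docsWith(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    // "NOT term" expressed as "ALL AND NOT term".
    Set<Integer> docsWithout(String term) {
        Set<Integer> result = new TreeSet<>(docsWith(ALL));
        result.removeAll(docsWith(term));
        return result;
    }

    public static void main(String[] args) {
        NegationSketch index = new NegationSketch();
        index.add(1, "dog", "horse");
        index.add(2, "dog", "cat");
        index.add(3, "horse");
        System.out.println(index.docsWithout("dog")); // prints [3]
    }
}
```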

If TF/IDF weighting is a problem for you, you can plug in your own Similarity 
implementation that removes all references to length normalization and 
document frequencies.
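
The effect of flattening the weights can be illustrated without Lucene. The sketch below is plain Java with a hypothetical `Weights` interface; it is not Lucene's Similarity, its method names are not claimed to match Lucene's signatures, and the TF/IDF formulas are one common textbook form chosen only for illustration. It shows that once every factor returns 1, a matching term always contributes the same score, which is the Boolean behaviour asked for.

```java
// Hypothetical stand-in for a pluggable weighting scheme; not Lucene code.
final class FlatWeightingSketch {

    interface Weights {
        float tf(int freq);                  // term frequency factor
        float idf(int docFreq, int numDocs); // document frequency factor
        float lengthNorm(int numTokens);     // length normalization factor
    }

    // One common TF/IDF variant, used here only as a contrast (an assumption).
    static final Weights TF_IDF = new Weights() {
        public float tf(int freq) {
            return (float) Math.sqrt(freq);
        }
        public float idf(int docFreq, int numDocs) {
            return (float) (Math.log((double) numDocs / (docFreq + 1)) + 1.0);
        }
        public float lengthNorm(int numTokens) {
            return (float) (1.0 / Math.sqrt(numTokens));
        }
    };

    // Every factor is constant, so all matches weigh the same.
    static final Weights FLAT = new Weights() {
        public float tf(int freq) { return 1f; }
        public float idf(int docFreq, int numDocs) { return 1f; }
        public float lengthNorm(int numTokens) { return 1f; }
    };

    // Contribution of one matching term in one document.
    static float score(Weights w, int freq, int docFreq, int numDocs, int numTokens) {
        return w.tf(freq) * w.idf(docFreq, numDocs) * w.lengthNorm(numTokens);
    }

    public static void main(String[] args) {
        // A term occurring 4 times in a 100-token document; 10 of 1000 documents contain it.
        System.out.println(score(TF_IDF, 4, 10, 1000, 100)); // depends on the statistics
        System.out.println(score(FLAT, 4, 10, 1000, 100));   // prints 1.0
    }
}
```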

Regards,

Kind regards from Saarbrücken

--

Dr.-Ing. Karsten Konrad
Head of Artificial Intelligence Lab

XtraMind Technologies GmbH
Stuhlsatzenhausweg 3
D-66123 Saarbrücken
Phone: +49 (681) 3025113
Fax: +49 (681) 3025109
[EMAIL PROTECTED]
www.xtramind.com



-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Monday, 1 December 2003 13:11
To: [EMAIL PROTECTED]
Subject: Real Boolean Model in Lucene?


Hi,

Is it possible to use a real Boolean model in Lucene for searching? When using 
the QueryParser with a boolean query (e.g. "dog AND horse"), one gets a list of 
documents from the Hits object. However, these documents have a ranking (score).

My Question: Does Lucene use TF/IDF for getting this? (which would mean it does not 
use the boolean model for the boolean query...)

How can one run a Boolean model search where every result has score=1? Is there an example?

Cheers,
Ralph

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

