I can't answer all of these questions fully, but since Doug is out, I'll make a start. Please check the FAQ for a more detailed explanation; I believe you will find enough information there to answer all of your questions. The FAQ is linked from the Jakarta page (there are actually two FAQs, so you might want to check both).
As far as I understand, Lucene is a probabilistic indexer. It supports boolean queries, but it also supports phrase queries, where it does true ranking. The ranking is based on how many of the search words appear in a document and how "important" the words are for that document, which is a function of the word frequency and the size of the document.

For a given search, the type of result you get depends on the type of Query that is used. For example, boolean queries can have "traditional" AND terms, which are all required for a match, but they can also have "optional" terms that rank the document higher if they are found but do not rule out a document if they are not.

I hope this helps.

Dmitry.

Melissa Mifsud wrote:

>Hi again!
>
>I should really reword my question as follows:
>
>On which criteria are relevant documents chosen given a particular query
>
>and
>
>once retrieved, how are these documents ranked?
>
>The techniques by which this is done will then determine what type of IR model Lucene
>implements.
>
>Thanks again!
>
>Melissa
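
P.S. To make the required/optional distinction a bit more concrete, here is a rough sketch in plain Java. It is only a toy: the class ToyBooleanScorer and its methods are made up for illustration, it does not use Lucene's API, and the score (term frequency divided by document length) is a simplified stand-in for "word frequency and size of the document", not Lucene's actual formula.

```java
import java.util.Arrays;
import java.util.List;

// Toy illustration only; this is NOT Lucene's API or its real scoring formula.
class ToyBooleanScorer {

    // Count how often `term` occurs in the tokenized document.
    static int termFrequency(List<String> docTokens, String term) {
        int count = 0;
        for (String token : docTokens) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // Returns -1 if the document is empty or any required term is missing
    // (the document does not match). Otherwise returns the sum of the
    // length-normalized frequencies of every query term that was found.
    static double score(List<String> docTokens,
                        List<String> required,
                        List<String> optional) {
        if (docTokens.isEmpty()) {
            return -1.0;
        }
        double total = 0.0;
        for (String term : required) {
            int tf = termFrequency(docTokens, term);
            if (tf == 0) {
                return -1.0; // a required term is absent: rule the document out
            }
            total += (double) tf / docTokens.size();
        }
        for (String term : optional) {
            // an absent optional term adds nothing but does not exclude the document
            total += (double) termFrequency(docTokens, term) / docTokens.size();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> required = Arrays.asList("lucene");
        List<String> optional = Arrays.asList("ranks");
        List<String> doc1 = Arrays.asList("lucene", "ranks", "documents");
        List<String> doc2 = Arrays.asList("lucene", "indexes", "text");
        List<String> doc3 = Arrays.asList("ranks", "documents");
        System.out.println("doc1: " + score(doc1, required, optional));
        System.out.println("doc2: " + score(doc2, required, optional));
        System.out.println("doc3: " + score(doc3, required, optional));
    }
}
```

Here doc1 and doc2 both match because they contain the required term, doc1 scores higher because it also contains the optional term, and doc3 is ruled out because the required term is missing.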
