Ricardo Baeza-Yates, Berthier Ribeiro-Neto: Modern Information Retrieval, ACM Press, ISBN 0-201-39829-X

The searching phase is trivial once you know the basic vector model.
The indexing phase is described on pp. 196-199; it is a classic algorithm.
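
For readers without the book at hand, here is a minimal sketch of the two phases, assuming a plain inverted index and tf-idf weights with cosine-style ranking. It is not Lucene's actual code or scoring formula; `build_index` and `search` are hypothetical names chosen for illustration, and tokenization is just lowercasing and splitting on whitespace.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Indexing phase: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Searching phase: rank documents by tf-idf dot product with the
    query, divided by the document vector length. The query length is
    the same for every document, so it is left out of the ranking."""

    def idf(term):
        # Rare terms get a high weight; a term in every document gets 0.
        return math.log(n_docs / len(index[term]))

    # Squared length of each document's tf-idf vector.
    norms = defaultdict(float)
    for term, postings in index.items():
        w = idf(term)
        for doc_id, tf in postings.items():
            norms[doc_id] += (tf * w) ** 2

    # Accumulate the dot product, visiting only the query terms' postings.
    scores = defaultdict(float)
    for term, qtf in Counter(query.lower().split()).items():
        if term not in index:
            continue
        w = idf(term)
        for doc_id, tf in index[term].items():
            scores[doc_id] += (qtf * w) * (tf * w)

    ranked = [
        (doc_id, score / math.sqrt(norms[doc_id]))
        for doc_id, score in scores.items()
        if norms[doc_id] > 0
    ]
    return sorted(ranked, key=lambda pair: -pair[1])
```

Calling `search(build_index(docs), len(docs), "lucene search")` on a small dictionary of `{doc_id: text}` returns `(doc_id, score)` pairs, best match first. Documents that share no term with the query do not appear at all, which is the point of the inverted index.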

Your queries:
1 - see the archive.
2 - you cannot solve it, AFAIK. BTW, you would be better off working with entropy than with raw frequencies.
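
One possible reading of the entropy remark, sketched under assumptions: the reply does not say which entropy measure is meant, so the code below swaps in the log-entropy style weighting known from the term-weighting literature. A term spread evenly over all documents has high entropy and gets a weight near 0; a term concentrated in one document gets weight 1. All function names are hypothetical, and nothing here is a Lucene API.

```python
import math
from collections import Counter


def term_counts(docs):
    """Per-document term frequencies (lowercase, whitespace tokens)."""
    return {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}


def entropy_weights(counts):
    """Weight each term by 1 - H(t) / log(N), where H(t) is the entropy
    of the term's frequency distribution over the N documents."""
    n_docs = len(counts)
    totals = Counter()
    for c in counts.values():
        totals.update(c)
    weights = {}
    for term, total in totals.items():
        h = 0.0
        for c in counts.values():
            if c[term]:
                p = c[term] / total
                h -= p * math.log(p)
        # With a single document there is nothing to compare against.
        weights[term] = 1.0 - h / math.log(n_docs) if n_docs > 1 else 1.0
    return weights


def top_terms(counts, weights, doc_id, k=3):
    """Rank a document's terms by frequency times entropy weight."""
    scored = {t: tf * weights[t] for t, tf in counts[doc_id].items()}
    return sorted(scored, key=lambda t: (-scored[t], t))[:k]
```

With raw frequencies, a filler word such as "the" tends to come out as the most frequent term of every document; with the entropy weight it drops to the bottom because it occurs almost evenly everywhere.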


-g-

[EMAIL PROTECTED] wrote:

Hello. I work at a young company called Denali, which is interested in using Lucene. I have looked on the official website for information about this, but I didn't find any detailed explanation of how the index is created and how the search is performed on it.
In fact, we would like to add two special queries:
-one that finds the most frequent terms in a document.
-one that finds the terms most frequently associated with another term (for example, for the given term "lucene", we would find "search", "moteur", "open source", ...).
If somebody could point me to detailed information, not on "how to use Lucene" but on "how does it work in detail?" (algorithms used, ...), it would be nice.
Best regards
Claude Libois



--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]







