The searching phase is straightforward if you know the basic vector space model. The indexing phase is described on pp. 196-199; it is a classic algorithm.
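To make the vector model concrete, here is a minimal sketch of an inverted index with tf-idf ranking. It is a textbook-style illustration, not Lucene's actual code or scoring formula; the function names (build_index, doc_norms, search), the whitespace tokenizer and the particular idf formula are all assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        # Naive tokenizer (assumption): lowercase and split on whitespace.
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def doc_norms(index, n_docs):
    """Euclidean length of each document's tf-idf vector."""
    norms = defaultdict(float)
    for postings in index.values():
        idf = math.log(n_docs / len(postings)) + 1.0
        for doc_id, tf in postings.items():
            norms[doc_id] += (tf * idf) ** 2
    return {doc_id: math.sqrt(v) for doc_id, v in norms.items()}


def search(index, n_docs, query):
    """Return (score, doc_id) pairs, best first.

    The score is the dot product of the query and document tf-idf
    vectors, divided by the document vector length.
    """
    norms = doc_norms(index, n_docs)
    scores = defaultdict(float)
    for term in set(query.lower().split()):
        postings = index.get(term)
        if not postings:
            continue  # term does not occur in any document
        idf = math.log(n_docs / len(postings)) + 1.0
        for doc_id, tf in postings.items():
            scores[doc_id] += (tf * idf) * idf
    return sorted(((s / norms[d], d) for d, s in scores.items()), reverse=True)


docs = ["lucene search engine", "open source search", "cooking pasta recipes"]
index = build_index(docs)
results = search(index, len(docs), "lucene search")
```

Only the postings of the query terms are visited at search time, which is why searching is cheap once the index exists.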
Your queries:
1 - see the archive.
2 - AFAIK you cannot solve it. BTW, you would do better to work with entropy than with raw frequencies.
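As a rough illustration of the difference between frequencies and entropy, here is a small sketch working on plain strings, outside Lucene. It assumes "entropy" means the Shannon entropy of a term's distribution across documents, which is only one possible reading; top_terms and term_entropy are hypothetical helper names, and the tokenizer is a naive whitespace split.

```python
import math
from collections import Counter


def top_terms(text, k):
    """Return the k most frequent terms in one document as (term, count)."""
    return Counter(text.lower().split()).most_common(k)


def term_entropy(term, docs):
    """Shannon entropy (in bits) of how a term's occurrences spread over docs.

    Low entropy: the term is concentrated in a few documents.
    High entropy: the term is spread evenly, so it says little about any one.
    """
    counts = [text.lower().split().count(term) for text in docs]
    total = sum(counts)
    if total == 0:
        return 0.0  # term never occurs
    probs = [c / total for c in counts if c > 0]
    return sum(p * math.log2(1.0 / p) for p in probs)


docs = ["lucene search engine search", "open source search", "lucene index"]
most_frequent = top_terms(docs[0], 2)
spread = {t: term_entropy(t, docs) for t in ("search", "engine", "lucene")}
```

A raw count ranks common words first regardless of how informative they are, while the entropy separates terms concentrated in a few documents from terms spread over all of them.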
-g-
[EMAIL PROTECTED] wrote:
Hello. I work at a young company called Denali which is interested in using Lucene. I have looked on the official website for information about this, but I did not find any detailed explanation of how the index is created and how searches are run against it.
In fact, we would like to add two special queries:
- one which finds the most frequent terms in a document.
- one which finds the terms most frequently associated with another term (for example, for a given term "lucene", we would find "search", "moteur", "open source", ...).
If somebody could point me to detailed information, not on "how to use Lucene" but on "how does it work in detail?" (algorithms used, ...), it would be appreciated.
Best regards
Claude Libois
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]