Hmm... doesn't Lucene scoring determine how relevant a document is to your query? I believe that is what PageRank and HITS do as well. A page and a document are the same thing: if you want to index a page, you'll obviously convert it into a document. PageRank does link analysis to determine how relevant a page is to the query you entered. Does Lucene have something similar? If two documents both contain a certain term, how does Lucene decide which one should score higher? Google uses PageRank to make that determination; how does Lucene do it?
On 1/22/07, Nicolas Lalevée <[EMAIL PROTECTED]> wrote:
On Monday 22 January 2007 19:33, EDMOND KEMOKAI wrote:
> Hi All
> This is a question for those familiar with Lucene document scoring. How
> does it compare with Google's PageRank or HITS, or are they very different?
> I have been looking at the PageRank algorithm but I'll need to brush up
> my math skills before delving into it :)

In fact, Lucene is just a search engine. You can then use that search engine to search web pages, as Nutch does with Lucene. Google is more like Nutch: a web crawler plus a web search engine.

So when you are talking about page ranking, it has nothing to do with Lucene scoring. Lucene scoring is about how well a result entry matches your query. Page ranking is more about how relevant the web page is in general. So for a document, the Lucene score depends on the query, while the page rank is quite absolute (it does not depend on the query).

Nicolas
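
To make the distinction above concrete, here is a rough Python sketch. It is not Lucene's or Google's actual code. `tfidf_score` is a simplified take on Lucene's classic TF-IDF similarity (it leaves out boosts, the coordination factor and query normalization), and `pagerank` is a textbook power iteration. The function names, the toy corpus and the toy link graph are all made up for illustration, and `pagerank` assumes every link target also appears as a key in `links`.

```python
import math


def tfidf_score(query_terms, doc_tokens, corpus):
    """Query-dependent score, loosely modelled on classic TF-IDF.

    corpus is a list of token lists; doc_tokens is one of them.
    """
    num_docs = len(corpus)
    score = 0.0
    for term in query_terms:
        freq = doc_tokens.count(term)
        if freq == 0:
            continue  # this term contributes nothing for this document
        doc_freq = sum(1 for d in corpus if term in d)
        tf = math.sqrt(freq)  # repeated terms help, with diminishing returns
        idf = 1.0 + math.log(num_docs / (doc_freq + 1))  # rare terms weigh more
        score += tf * idf * idf
    # Length normalization: a match in a short document counts for more.
    return score / math.sqrt(len(doc_tokens))


def pagerank(links, damping=0.85, iterations=50):
    """Query-independent score; links maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy data, only to show how the two functions are called.
corpus = [
    ["lucene", "scoring", "lucene"],
    ["lucene", "index", "search", "engine"],
    ["page", "rank"],
]
links = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}

if __name__ == "__main__":
    for doc in corpus:
        print(doc, tfidf_score(["lucene"], doc, corpus))
    print(pagerank(links))
```

Note that `tfidf_score` takes the query as an argument, so the same document gets a different score for a different query, whereas `pagerank` never sees a query at all. A web search engine can combine both kinds of signal, but how any particular engine weights them is not something this sketch tries to show.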
-- "talk trash and carry a small stick." PAUL KRUGMAN (NYT)