Hello,
Lucene returns the best-matching documents for a given query. PageRank uses
citation analysis with similar results, but requires a large set of link
metadata to compute.
Scoring in Lucene delivers pure search relevance, while PageRank attempts to
establish source authority.
I'm not strong in math; those who are can find an explanation of the
latter here:
http://www.ams.org/featurecolumn/archive/pagerank.html
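For readers who prefer code to math, here is a minimal sketch of the citation analysis idea, written as a plain power iteration in Python. The four-page `toy_web` graph, the function name `pagerank`, the damping factor of 0.85 and the iteration count are all assumptions chosen for illustration; this is not Google's implementation and has nothing to do with Lucene's code.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "c" is cited by three pages, "d" by none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Note that no query appears anywhere in this computation: the ranks come only from who links to whom, which is why the whole link graph has to be available first.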
Regards,
Peter W.
On Jan 22, 2007, at 12:00 PM, Mark Miller wrote:
Well, first Lucene checks all of the other documents in the world
for any that refer to the document that you're adding to
Lucene...and then...oh wait...
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
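As a rough illustration of the kind of factors the Similarity page describes (term frequency, inverse document frequency and length normalization), here is a simplified sketch in Python. The exact formulas used below, the function name `score` and the three toy documents are assumptions; Lucene's real scoring also involves factors such as boosts, coord and query normalization that are left out here.

```python
import math
from collections import Counter


def score(query_terms, doc_tokens, corpus):
    """Simplified TF-IDF style score of one document for a list of query terms."""
    counts = Counter(doc_tokens)
    # Shorter documents get a larger norm, so a match in them counts for more.
    length_norm = 1.0 / math.sqrt(len(doc_tokens)) if doc_tokens else 0.0
    total = 0.0
    for term in query_terms:
        freq = counts[term]
        if freq == 0:
            continue  # a term that is absent contributes nothing
        # Number of documents in the corpus that contain the term.
        doc_freq = sum(1 for d in corpus if term in d)
        tf = math.sqrt(freq)  # dampened term frequency
        idf = 1.0 + math.log(len(corpus) / (doc_freq + 1))  # rarer terms weigh more
        total += tf * idf * idf * length_norm
    return total


# Hypothetical three-document corpus, tokenized on whitespace.
docs = {
    "short": "lucene scoring".split(),
    "long": "lucene is a search library and scoring is one part of it".split(),
    "other": "pagerank ranks web pages by links".split(),
}
corpus = list(docs.values())
scores = {name: score(["lucene"], tokens, corpus) for name, tokens in docs.items()}
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 3))
```

In this sketch both "short" and "long" contain the query term once, and the shorter document scores higher purely because of length normalization. Nothing outside the index and the query is consulted.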
EDMOND KEMOKAI wrote:
Hmm... doesn't Lucene scoring determine how relevant a document is to your
query? That is what PageRank and HITS do as well, I believe. Page and
document are the same: if you want to index a page, you'll obviously try to
convert it into a document. PageRank does link analysis to determine how
relevant that page is as it relates to the query you entered; does Lucene
have something similar? How does Lucene decide which of two documents
should score higher if they both contain a certain term? Google
uses PageRank to make that determination; how does Lucene do it?
On 1/22/07, Nicolas Lalevée <[EMAIL PROTECTED]> wrote:
On Monday 22 January 2007 19:33, EDMOND KEMOKAI wrote:
> Hi All
> This is a question for those familiar with Lucene document scoring. How
> does it compare with Google's PageRank or HITS, or are they very different?
> I have been looking at the PageRank algorithm but I'll need to brush up
> my math skills before delving into it :)
In fact, Lucene is just a search engine. You can then use the search engine
to search web pages, as Nutch does with Lucene. Google is more like
Nutch: a web crawler plus a web search engine. So when you are talking
about page ranking, it has nothing to do with Lucene scoring. Lucene scoring
is about how well a result entry matches your query. Page ranking is more
about how relevant the web page is in itself. So for a document, the Lucene
score depends on the query, while the page rank is essentially absolute.
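The difference between a query-dependent score and an absolute rank can be shown with a toy sketch in Python. Everything here is hypothetical: the three pages, the hand-picked `static_rank` values standing in for a link-based rank, and the `match_score` function, which just counts term occurrences and is far cruder than what Lucene does.

```python
# Hypothetical, hand-picked authority values standing in for a link-based rank.
static_rank = {"home": 0.6, "faq": 0.3, "blog": 0.1}

# Hypothetical page texts.
pages = {
    "home": "welcome to the project home page",
    "faq": "scoring questions and scoring answers",
    "blog": "notes on link analysis and link graphs",
}


def match_score(query, text):
    """Toy query-dependent score: how often the query term occurs in the text."""
    return text.split().count(query)


def order_by_match(query):
    """Order pages by how well they match this particular query."""
    return sorted(pages, key=lambda p: match_score(query, pages[p]), reverse=True)


# The authority ordering is computed once and never looks at a query.
by_authority = sorted(pages, key=static_rank.get, reverse=True)

print("authority:", by_authority)
print("query 'scoring':", order_by_match("scoring"))
print("query 'link':", order_by_match("link"))
```

The authority ordering stays the same whatever is asked, while the match ordering changes with each query.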
Nicolas
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]