David, Bruce, Otis,
Thank you all for the quick replies. I looked through the BooksLikeThis
example. I also agree, it's a very good and effective way to find
similar docs in the index. Nevertheless, what I need is really a
similarity matrix holding all TF*IDF values. For illustration I quick and
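
For readers following the thread, here is a minimal sketch of what a "matrix holding all TF*IDF values" could look like. It is plain Java and does not use Lucene; the class name TfIdfMatrix, the pre-tokenized input, and the textbook weighting tf * ln(N / df) are all assumptions made for illustration, and are not necessarily what Lucene's Similarity class computes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of Lucene.
class TfIdfMatrix {

    /** Sorted list of all distinct terms; defines the matrix columns. */
    static List<String> vocabulary(List<List<String>> docs) {
        TreeSet<String> terms = new TreeSet<>();
        for (List<String> doc : docs) {
            terms.addAll(doc);
        }
        return new ArrayList<>(terms);
    }

    /** Rows are documents, columns are terms, cells are tf * ln(N / df). */
    static double[][] build(List<List<String>> docs) {
        List<String> vocab = vocabulary(docs);
        int n = docs.size();

        // Map each term to its column index.
        Map<String, Integer> column = new HashMap<>();
        for (int c = 0; c < vocab.size(); c++) {
            column.put(vocab.get(c), c);
        }

        // Document frequency: number of documents containing each term.
        int[] df = new int[vocab.size()];
        for (List<String> doc : docs) {
            for (String term : new TreeSet<>(doc)) {
                df[column.get(term)]++;
            }
        }

        double[][] matrix = new double[n][vocab.size()];
        for (int d = 0; d < n; d++) {
            // Raw term frequency first.
            for (String term : docs.get(d)) {
                matrix[d][column.get(term)] += 1.0;
            }
            // Then scale each cell by the inverse document frequency.
            for (int c = 0; c < vocab.size(); c++) {
                matrix[d][c] *= Math.log((double) n / df[c]);
            }
        }
        return matrix;
    }
}
```

With this weighting, a term that occurs in every document gets weight 0, so it contributes nothing to any later similarity computation. A dense array is only practical for small collections; a sparse map per document would be the more realistic choice for a real index.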
Christoph Kiefer wrote:
David, Bruce, Otis,
Thank you all for the quick replies. I looked through the BooksLikeThis
example. I also agree, it's a very good and effective way to find
similar docs in the index. Nevertheless, what I need is really a
similarity matrix holding all TF*IDF values. For
-
From: Christoph Kiefer [mailto:[EMAIL PROTECTED]
Sent: December 14, 2004 11:45 AM
To: Lucene Users List
Subject: TFIDF Implementation
Hi,
My current task/problem is the following: I need to implement
TFIDF document term ranking using Jakarta Lucene to compute a
similarity rank between arbitrary
://www.jivesoftware.com/
-----Original Message-----
From: Christoph Kiefer [mailto:[EMAIL PROTECTED]
Sent: December 14, 2004 11:45 AM
To: Lucene Users List
Subject: TFIDF Implementation
Hi,
My current task/problem is the following: I need to implement
TFIDF document term ranking using Jakarta Lucene
[mailto:[EMAIL PROTECTED]
Sent: December 14, 2004 11:45 AM
To: Lucene Users List
Subject: TFIDF Implementation
Hi,
My current task/problem is the following: I need to implement
TFIDF document term ranking using Jakarta Lucene to compute a
similarity rank between arbitrary documents
You can also see 'Books like this' example from here
https://secure.manning.com/catalog/view.php?book=hatcher2&item=source
Well done, uses a term vector, instead of reparsing the orig
doc, to form the similarity query. Also I like the way you
exclude the source doc in the query, I
From the code I looked at, those calls don't recalculate on
every call.
I was referring to this fragment below from BooksLikeThis.docsLike(),
and was mentioning it as the javadoc
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/TermFreqVector.html
does not say that
Bruce Ritchie wrote:
From the code I looked at, those calls don't recalculate on
every call.
I was referring to this fragment below from BooksLikeThis.docsLike(),
and was mentioning it as the javadoc
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/TermFreqVector.html
does
Bruce Ritchie wrote:
You can also see 'Books like this' example from here
https://secure.manning.com/catalog/view.php?book=hatcher2&item=source
Well done, uses a term vector, instead of reparsing the orig
doc, to form the similarity query. Also I like the way you
exclude the source doc in
.
Regards,
Bruce Ritchie
http://www.jivesoftware.com/
-----Original Message-----
From: Christoph Kiefer [mailto:[EMAIL PROTECTED]
Sent: December 14, 2004 11:45 AM
To: Lucene Users List
Subject: TFIDF Implementation
Hi,
My current task/problem is the following: I need to implement
Hi,
My current task/problem is the following: I need to implement TFIDF
document term ranking using Jakarta Lucene to compute a similarity rank
between arbitrary documents in the constructed index.
I saw from the API that there are similar functions already implemented
in the class Similarity and
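
As an illustration of the "similarity rank between arbitrary documents" being asked about, one common choice is the cosine of the angle between two documents' term-weight vectors. The sketch below is plain Java and assumes each document has already been reduced to a sparse map from term to TF*IDF weight; the class name CosineRank and that input shape are assumptions for the example, not Lucene API.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene.
class CosineRank {

    /**
     * Cosine similarity of two sparse term-weight vectors.
     * Returns 0.0 if either vector has zero length.
     */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;

        for (Map.Entry<String, Double> entry : a.entrySet()) {
            double weightA = entry.getValue();
            normA += weightA * weightA;
            // Only terms present in both documents contribute to the dot product.
            Double weightB = b.get(entry.getKey());
            if (weightB != null) {
                dot += weightA * weightB;
            }
        }
        for (double weightB : b.values()) {
            normB += weightB * weightB;
        }

        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Calling cosine for every pair of documents would fill a document-by-document similarity matrix; because the measure is symmetric, only half of the pairs need to be computed.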