You could use Term Vectors (TVs) to do this, but I don't know of any
existing code for it. Might be a good contrib module, though.
Search this list or see Lucene In Action or I have some TV sample
code at http://www.cnlp.org/apachecon2005/
You might also check the Carrot2 project, which has a number of
clustering algorithms and some Lucene support, although I don't know
if it does specifically what you want.
On Apr 2, 2007, at 10:14 PM, Lokeya wrote:
Hi All,
I have queried and have got a HITS object which is a collection of
documents. I want to find out the centroid of these documents.
Centroid =
Top Most 35(for eg)common terms across all the documents in the HITS
object.
Is there any API in Lucene for this?
Thanks in Advance.
--
View this message in context: http://www.nabble.com/How-to-
calculate-centroid-from-HITS--tf3509432.html#a9802563
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org
Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/
LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]