You could use term vectors (TVs) to do this, but I don't know of any existing code for it. It might make a good contrib module, though. For more on term vectors, search this list, see Lucene in Action, or look at my TV sample code at http://www.cnlp.org/apachecon2005/

You might also check the Carrot2 project, which has a number of clustering algorithms and some Lucene support, although I don't know if it does specifically what you want.
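
To make the term vector idea a bit more concrete, here is a rough sketch of the aggregation step only. It is plain Java with no Lucene dependency: it assumes you have already pulled a terms array and a parallel frequencies array for each hit, e.g. via IndexReader.getTermFreqVector(hits.id(i), field) and then getTerms() / getTermFrequencies(), which requires the field to have been indexed with term vectors enabled. The class name CentroidSketch and the method topTerms are made up for illustration, and "most common" is taken here to mean the highest summed term frequency across the hits; counting the number of documents containing each term would be an equally reasonable reading.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
public class CentroidSketch {

    /**
     * Sums term frequencies over all documents and returns the n terms with
     * the highest totals. terms.get(d) and freqs.get(d) are parallel arrays
     * for document d (the shape a term vector would give you).
     */
    public static List<String> topTerms(List<String[]> terms, List<int[]> freqs, int n) {
        Map<String, Integer> totals = new HashMap<>();
        for (int d = 0; d < terms.size(); d++) {
            String[] t = terms.get(d);
            int[] f = freqs.get(d);
            for (int i = 0; i < t.length; i++) {
                totals.merge(t[i], f[i], Integer::sum);
            }
        }

        // Highest total first; ties broken alphabetically so the result is stable.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(totals.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Two made-up documents standing in for two hits.
        List<String[]> terms = new ArrayList<>();
        List<int[]> freqs = new ArrayList<>();
        terms.add(new String[] {"index", "lucene", "search"});
        freqs.add(new int[] {2, 3, 1});
        terms.add(new String[] {"lucene", "vector"});
        freqs.add(new int[] {1, 4});

        System.out.println(topTerms(terms, freqs, 2)); // prints [lucene, vector]
    }
}
```

For the case in the question you would pass n = 35. With a large Hits object, loading a term vector per document can get slow, so you may want to cap how many hits you walk over.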

On Apr 2, 2007, at 10:14 PM, Lokeya wrote:


Hi All,

I have run a query and got a Hits object, which is a collection of
documents. I want to find the centroid of these documents, where the
centroid is the top N (e.g. 35) most common terms across all the
documents in the Hits object.

Is there any API in Lucene for this?

Thanks in advance.
--
View this message in context: http://www.nabble.com/How-to-calculate-centroid-from-HITS--tf3509432.html#a9802563
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ


