Dear all,

I'd like to cluster documents by their full text using Lucene. In other words, 
I would like to group similar documents together. I searched the mailing list 
and found two approaches. The first is to represent one document as a query 
and search the collection with it. The other is to construct a term vector for 
each document and use the cosine similarity between those vectors as the 
similarity measure. I found these methods here:

- http://www.mail-archive.com/[EMAIL PROTECTED]/msg04916.html
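
To make the second approach concrete, here is a minimal sketch of how I understand it. It is plain Python rather than Lucene's API, it uses raw term counts with naive whitespace tokenization, and the function names (term_vector, cosine_similarity) are my own and purely illustrative:

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector from lowercased, whitespace-split tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors (0.0 if either is empty)."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical sample documents, for illustration only.
    d1 = term_vector("lucene full text search")
    d2 = term_vector("full text search engine")
    d3 = term_vector("weather forecast tomorrow")
    print(cosine_similarity(d1, d2))  # shares three terms with d1
    print(cosine_similarity(d1, d3))  # shares no terms with d1
```

A clustering step would then group documents whose pairwise similarity exceeds some threshold; in practice the counts would presumably be weighted (for example by tf-idf) instead of used raw.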

I would like to know whether there is a better way, or whether the recent 
release of Lucene has any built-in functions for clustering.

Thank you.

Kind regards,

Supheakmungkol
