Have you tried using Carrot2 with Lucene?  They work quite well in tandem!
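
If you want to experiment with the term-vector approach from the original message first, here is a minimal sketch of cosine similarity between two documents in plain Java. It is only an illustration under assumptions: the class name CosineSketch, the helper names, and the naive tokenizer are all hypothetical, and it does not use any Lucene or Carrot2 API. In practice the term frequencies would more likely come from your index than from splitting raw strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, not part of Lucene or Carrot2.
public class CosineSketch {

    // Build a term-frequency vector from raw text.
    // Assumption: a naive tokenizer (lowercase, split on non-alphanumerics)
    // stands in for a real analyzer.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // Returns 0.0 when either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        Map<String, Integer> d1 = termFrequencies("lucene document clustering");
        Map<String, Integer> d2 = termFrequencies("document clustering with lucene");
        System.out.println(cosine(d1, d2));
    }
}
```

A clustering step would then group documents whose pairwise similarity exceeds some threshold; raw counts are used here for brevity, and a weighting such as TF-IDF is a common refinement.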

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Supheakmungkol SARIN <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Wednesday, May 14, 2008 11:23:45 PM
> Subject: Document clustering with Lucene
> 
> Dear all,
> 
> I'd like to do document clustering on full text with Lucene. In other words,
> I would like to group similar documents together. I searched the mailing
> list and found two approaches. The first is to represent one document as a
> query and search the collection with it. The other is to construct a term
> vector for each document and use the cosine distance function to compute
> the similarity. I found these methods here:
> 
> - http://www.mail-archive.com/[EMAIL PROTECTED]/msg04916.html
> 
> I would like to know whether there is a better way, or whether there are any
> built-in functions for clustering in the recent release of Lucene.
> 
> Thank you.
> 
> Kind regards,
> 
> Supheakmungkol

