Have you tried using Carrot2 with Lucene? They work quite well in tandem!

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Supheakmungkol SARIN <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Wednesday, May 14, 2008 11:23:45 PM
> Subject: Document clustering with Lucene
>
> Dear all,
>
> I'd like to do document clustering on full text with Lucene. In other
> words, I would like to group similar documents into their respective
> groups. I searched the mailing list and found two ways to do this. The
> first is to treat one document as a query and search the collection
> with it. The other is to construct a term vector for each document and
> use the cosine distance function to compute the similarity. I found
> these methods here:
>
> - http://www.mail-archive.com/[EMAIL PROTECTED]/msg04916.html
>
> I would like to know whether there is a better way, or whether the
> most recent release of Lucene has any built-in functions for
> clustering.
>
> Thank you.
>
> Kind regards,
>
> Supheakmungkol
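
For reference, the second method mentioned in the quoted message (term vectors compared with a cosine measure) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses raw term counts rather than TF-IDF weights, whitespace tokenization, and plain Python with no Lucene calls. The helper names term_vector and cosine_similarity and the sample documents are made up for the example; with Lucene, the counts would presumably come from the index rather than from re-tokenizing the text.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a term-frequency vector (assumption: lowercase + whitespace split)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical sample documents, for illustration only.
    docs = {
        "d1": "lucene search index",
        "d2": "lucene index",
        "d3": "cooking pasta recipes",
    }
    vectors = {name: term_vector(text) for name, text in docs.items()}
    names = sorted(vectors)
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            print(x, y, round(cosine_similarity(vectors[x], vectors[y]), 3))
```

A clustering step would then group documents whose pairwise similarity is high. Computing every pair is quadratic in the number of documents, which is one reason a dedicated clustering library may be the more practical route for larger collections.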