You might also note that Solr 1.4 has Carrot2 integrated and will eventually have Mahout support too. Carrot2 is often appropriate for smaller clustering jobs, as it uses an in-memory model. See http://wiki.apache.org/solr/ClusteringComponent . Also see the Carrot2 project at http://project.carrot2.org

-Grant

On Aug 22, 2009, at 4:52 PM, Sean Owen wrote:

The good news is that this is a very small volume. Lucene and Mahout operate,
broadly, in the realm of tens of millions of things or more. At this scale I
think performance will not be an issue no matter what you choose, so choose
based on your other requirements.

On Aug 22, 2009 9:18 PM, "Tim Hughes" <[email protected]> wrote:


We are looking to query documents and abstracts from a legacy system, then
retrieve the documents for clustering and classification via Mahout. Expected
volume is on the order of 2,000 to 3,000 documents.

Ted Dunning wrote:
>
> Can you say more about your application?
>
> Mahout is a very young proj...
--
View this message in context:
http://www.nabble.com/Custom-Algorithm-%28C-C%2B%2B%29---tp25096676p25097395.html

Sent from the Mahout User List mailing list archive at Nabble.com.

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
