You might also note that Solr 1.4 has Carrot2 integrated and
will eventually have Mahout support too. Carrot2 is often appropriate
for smaller clustering jobs, as it uses an in-memory model. See
http://wiki.apache.org/solr/ClusteringComponent. Also see the Carrot2
project at http://project.carrot2.org
-Grant
On Aug 22, 2009, at 4:52 PM, Sean Owen wrote:
The good news is that this is a very small volume. Lucene and Mahout operate,
broadly, in the realm of tens of millions of things or more. At this scale I
think performance will not be an issue no matter what you choose, so choose
based on your other requirements.
On Aug 22, 2009 9:18 PM, "Tim Hughes" <[email protected]> wrote:
We are looking to do a query of documents & abstracts from a legacy system,
then retrieve the docs for clustering & classification via Mahout. Expected
volume is something on the order of 2,000 - 3,000 documents.
Ted Dunning wrote:
> Can you say more about your application?
>
> Mahout is a very young proj...
--
View this message in context:
http://www.nabble.com/Custom-Algorithm-%28C-C%2B%2B%29---tp25096676p25097395.html
Sent from the Mahout User List mailing list archive at Nabble.com.
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search