We want to query documents and abstracts from a legacy system, then retrieve the documents for clustering and classification with Mahout. We expect a volume on the order of 2,000 to 3,000 documents.

Ted Dunning wrote:
>
> Can you say more about your application?
>
> Mahout is a very young project and is known to be sub-standard in a number
> of respects due to youth. Depending on what you need, it might be
> excellent, or seriously deficient (at the moment). The deficiencies will be
> addressed over time, but full disclosure now is important.
>
> Depending on what you need, an on-line learning system like vowpal might be
> much better for you.
>
> On Sat, Aug 22, 2009 at 12:59 PM, Tim Hughes <[email protected]> wrote:
>
>> We're looking specifically at Mahout (on top of the other supporting Apache
>> projects). One of the roadblocks to moving in that direction is the concern
>> about Java performance. We could not go the Mahout direction if there was no
>> way to use C/C++; since there is, we can bypass the "premature optimization"
>> and run Mahout as designed, yet have the ability to fall back to custom C
>> code if the user's expectations are not met.
>>
>
> --
> Ted Dunning, CTO
> DeepDyve
>

--
View this message in context: http://www.nabble.com/Custom-Algorithm-%28C-C%2B%2B%29---tp25096676p25097395.html
Sent from the Mahout User List mailing list archive at Nabble.com.
