On Nov 2, 2011, at 7:17 AM, Tharindu Mathew wrote:

> I want to create a java UI tool (based on a web app) that can pick and
> apply different algorithms available in Mahout to different data sets.

Very cool! Keep us posted, as this would be immensely useful! Any chance it will be donated back? :-)

>
> Hence the embedding with java. Obviously, I understand that everything is
> callable from Java since it's written in Java :).
>
> For example, I want to apply a classification (ex: Bayesian) algorithm,
> and train on a data set stored in Cassandra. I don't expect a sample for
> Cassandra but at least a code sample that operates on a data set stored in a
> CSV file that applies an algorithm like Bayesian.
>
> I'd appreciate if you can point me to any code sample for this or something
> similar?

As others have said, MahoutDriver is a common entry point and can run pretty much anything in Mahout that has a main().

You might also look in $MAHOUT_HOME/examples/bin at the various shell scripts we've put together that run different examples. build-reuters, classify-20newsgroups and build-asf-email (all from trunk) demonstrate a fair number of classification and clustering algorithms.

Finally, unit tests are your friend.

-Grant
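
P.S. Since anything with a main() can be driven from your own code, here is a minimal sketch of that general pattern in plain Java: look up a class by name and call its main(String[]) via reflection. This is not Mahout code and not how MahoutDriver is necessarily implemented. MainInvoker, EchoJob and runMain are hypothetical names made up for illustration, and EchoJob is only a stand-in so the sketch runs without Mahout on the classpath. In a web app you would pass the fully qualified name of a real class that has a main(), along with whatever arguments that class expects.

```java
import java.lang.reflect.Method;

public class MainInvoker {

    /**
     * Hypothetical stand-in for a job class with a main(); it only records
     * the arguments it was given. A real Mahout class would go here instead.
     */
    public static class EchoJob {
        public static String lastArgs = "";

        public static void main(String[] args) {
            lastArgs = String.join(" ", args);
        }
    }

    /**
     * Loads the named class, finds its public static main(String[]) and
     * invokes it with the given arguments.
     */
    public static void runMain(String className, String... args) throws Exception {
        Class<?> clazz = Class.forName(className);
        Method main = clazz.getMethod("main", String[].class);
        // Cast to Object so the array is passed as one argument, not spread as varargs.
        main.invoke(null, (Object) args);
    }

    public static void main(String[] args) throws Exception {
        // The argument names below are placeholders, not real Mahout options.
        runMain(EchoJob.class.getName(), "--input", "data.csv");
        System.out.println(EchoJob.lastArgs);
    }
}
```

One thing to keep in mind with this approach: a main() written for the command line may call System.exit() or throw on bad arguments, so in a UI tool you would likely want to wrap the call and handle failures yourself.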
