I haven't used Brisk, but I doubt there would be any real difference for your use case. Hadoop's integration with Cassandra is fairly light and arm's-length, so the details of the Cassandra distro behind it ought not to matter much.

Cassandra isn't the most natural choice as a Hadoop data store, but it can be made to work. If you're choosing tools from scratch, just using HDFS as your data store is probably more natural.

Mahout has virtually no direct relationship to Cassandra, so the same comments apply even more strongly to Brisk vs. vanilla Cassandra with Mahout. It should not matter much, if at all.

The only direct integration with Cassandra is in the non-distributed Recommender, where I cobbled together a CassandraDataModel for an article I wrote (http://www.acunu.com/blogs/sean-owen/recommending-cassandra/).

Mahout doesn't use Cassandra directly when running on Hadoop, but you can modify it to read from Cassandra via an InputFormat. Again, conveniently, I wrote about that recently for Acunu:
http://www.acunu.com/blogs/sean-owen/scaling-cassandra-and-mahout-hadoop/

Sean

On Sat, Nov 26, 2011 at 5:13 PM, Tan Shern Shiou <[email protected]> wrote:
> Hello,
>
> I am planning to use Mahout with Hadoop and Cassandra as datastore. I have
> been reading about the goodness of using Brisk. Can we use Mahout with Brisk
> as is from the package like how we implement it with Hadoop and Cassandra?
>
> Whats the difference of using Brisk and a combination of Hadoop and
> Cassandra?
>
> thanks.
>
