If you want distributed machine learning, you can use either Mahout (runs on
Hadoop) or Spark (MLlib). If you choose the Hadoop route, DataStax provides a
connector (CFS) to interact with data stored in Cassandra. Otherwise you can
try to use the Cassandra InputFormat (not as simple, but plenty
I’ve supported a variety of different “big data” systems and most have their
own particular set of use cases that make sense. Having said that, I believe
that Cassandra uniquely excels at the following:
* Low write latency for small to medium write sizes (logs, sensor
data, etc.)
*
If you're interested and/or need some Cassandra Docker images, let me know and
I'll shoot you a link.
James
On May 21, 2014, at 10:19 AM, Jabbar Azam aja...@gmail.com wrote:
That sounds interesting. I was thinking of using CoreOS with Docker
containers for the business
to add and remove the nodes from the cluster and also the node
cleanup.
Disclaimer: this is not a production system but something I'm experimenting
with in my own time.
Thanks
Jabbar Azam
On 21 May 2014 15:51, James Horey j...@opencore.io wrote:
If you're interested and/or need some
If you’re running unit tests and repeatedly clearing the Cassandra keyspaces,
you may want to check out Ferry (ferry.opencore.io). It lets you
stand up and destroy multiple Cassandra stacks locally on your machine and is useful
for the use case you described. I’m the author of Ferry, and would be
Hello all,
I’m trying to collect and organize Cassandra applications for educational
purposes. I’m hoping that by collating these applications in a single place,
new users will be able to get up to speed a bit easier. If you know of a great
application (should be open-source and preferably up