Storm can make a great execution container for Streams. W2O has used Storm Trident for many use cases on the Apache Streams roadmap, and we are interested in developing this aspect of Streams.
We currently use Kafka as our primary queuing and mid-stream persistence mechanism. It works well when messages generally follow a linear progression from source to sink through a stream. We haven't needed complex forking conditions or multi-module pub/sub yet. In those cases Cassandra can support queries that would ease implementation compared with Kafka.

As we look at client libraries for Cassandra, here's another contender. Intravert runs alongside C* on each node, exposes a very simple REST API, and lets the client specify script-like conditions that get evaluated within the C* cluster as result sets are prepared. I have a lot of respect for the Datastax team, but I think CQL has gone somewhat off the rails and CQL 3 isn't the best way to take advantage of what Cassandra is good at.

https://github.com/zznate/intravert-ug

Steve Blackmon
Director, Data Sciences
W2O Group
101 W. 6th Street Suite 330
Austin, Texas 78701
cell 512.965.0451 | work 512.402.6366
twitter @steveblackmon

On 8/27/13, 11:19 AM, "Danny Sullivan" <[email protected]> wrote:

>The Storm Project, http://storm-project.net/, would add processing of
>ActivityStreamsEntry objects to Apache Streams (make it faster). Storm
>has integration support for Cassandra. There is an available Java Driver
>to hook up Storm and Cassandra available here:
>https://github.com/ptgoetz/storm-cassandra. While I think it is a good
>idea to add Storm to Streams, Cassandra recommends using CQL3 as an
>interface from applications to the database which moves away from older
>thrift clients: http://wiki.apache.org/cassandra/ClientOptions. These
>older thrift clients include Astyanax and Hector (I have had experience
>incorporating both into the project, with varied success). The newer
>client which relies on CQL3 support is the Datastax Java Driver:
>https://github.com/datastax/java-driver. I've looked into the Datastax
>driver and have been very pleased with the CQL support. However, it is
>not very far along in the development process and doesn't have an object
>mapper and I'm cautious adopting a product so early in its development.
>If we were to integrate Storm and Streams would we want to use the
>storm-cassandra driver or should we look into adding the Datastax driver
>which has better CQL support?
>
>Danny
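
A rough illustration of the distinction drawn in the reply above between a linear source-to-sink flow and a forking, multi-subscriber flow. This is only a sketch using in-memory Java collections as stand-ins. It does not use Kafka, Cassandra, Storm, or Intravert, and the names StreamShapes, runLinear, Subscriber, and publish are hypothetical, made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

public class StreamShapes {

    // Linear flow: every message passes through the same ordered steps
    // and ends up in a single sink. A plain queue models this well.
    static List<String> runLinear(List<String> source,
                                  List<Function<String, String>> steps) {
        List<String> sink = new ArrayList<>();
        for (String message : source) {
            String current = message;
            for (Function<String, String> step : steps) {
                current = step.apply(current);
            }
            sink.add(current);
        }
        return sink;
    }

    // Forked flow: each subscriber declares a condition, and a message is
    // delivered to every subscriber whose condition matches. The condition
    // plays the role that a query would play in a queryable store.
    static final class Subscriber {
        final String name;
        final Predicate<String> condition;
        final List<String> received = new ArrayList<>();

        Subscriber(String name, Predicate<String> condition) {
            this.name = name;
            this.condition = condition;
        }
    }

    static void publish(List<String> source, List<Subscriber> subscribers) {
        for (String message : source) {
            for (Subscriber subscriber : subscribers) {
                if (subscriber.condition.test(message)) {
                    subscriber.received.add(message);
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("tweet:hello", "post:world", "tweet:storm");

        // Linear case: two processing steps applied to every message.
        List<Function<String, String>> steps = new ArrayList<>();
        steps.add(String::toUpperCase);
        steps.add(m -> m + "!");
        System.out.println(runLinear(source, steps));

        // Forked case: one selective subscriber and one catch-all subscriber.
        Subscriber tweets = new Subscriber("tweets", m -> m.startsWith("tweet:"));
        Subscriber archive = new Subscriber("archive", m -> true);
        publish(source, Arrays.asList(tweets, archive));
        System.out.println(tweets.name + " received " + tweets.received);
        System.out.println(archive.name + " received " + archive.received);
    }
}
```

In the linear case the routing is fixed and a queue suffices. In the forked case each consumer selects messages by a condition, which is the kind of work the reply suggests pushing into Cassandra queries.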
