In my opinion, triggers/stored procedures are an absolute requirement for any 
distributed database.

We've been using stored procedures in Cassandra for a while now. We've made 
modifications such that we don't really write directly anymore but pass 
everything through either a default stored procedure (which is just the 
behavior that was there before) or a dynamically loaded piece of Java.

These stored procedures can call other dynamically loaded pieces of Java as 
well - we don't have any plans to implement scripting capabilities.  We can 
also 'select' from procedures.
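
To make that concrete, here is a rough sketch of what such a dynamically 
loaded procedure could look like (the interface name and method signatures 
below are purely illustrative - they are not our actual code):

    import java.nio.ByteBuffer;
    import java.util.List;
    import java.util.Map;

    // Hypothetical sketch of a dynamically loaded stored procedure.
    // All writes get routed through an implementation of this instead of
    // going straight to the column family; the default implementation just
    // applies the columns as-is, custom ones can transform or fan them out.
    public interface StoredProcedure {

        // Called in place of a direct write.
        void write(String keyspace, String columnFamily,
                   ByteBuffer rowKey, Map<ByteBuffer, ByteBuffer> columns);

        // Called when a client 'select's from the procedure: compute the
        // result server-side and return rows instead of raw column data.
        List<Map<ByteBuffer, ByteBuffer>> select(String keyspace,
                   String columnFamily, ByteBuffer rowKey);
    }

A custom implementation would also be free to call other dynamically loaded 
classes in the same way.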

The idea of downloading data from a distributed database for processing flies 
in the face of what NoSQL and big data are all about - you've got to do it in 
the db.
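
As a purely illustrative example of the kind of processing that belongs on the 
server (hypothetical code, assuming the sketch interface above), a select-style 
procedure could roll a wide time-series row up into a tiny summary before 
anything leaves the node - essentially the use case Praveen describes below:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.NavigableMap;

    // Hypothetical rollup helper: given the columns of a wide time-series
    // row (timestamp -> value), return min/max/avg instead of the raw row.
    public final class TimeSeriesRollup {

        public static Map<String, Double> summarize(NavigableMap<Long, Double> points) {
            Map<String, Double> summary = new HashMap<String, Double>();
            if (points.isEmpty()) {
                return summary;
            }
            double min = Double.MAX_VALUE;
            double max = -Double.MAX_VALUE;
            double sum = 0.0;
            for (double value : points.values()) {
                min = Math.min(min, value);
                max = Math.max(max, value);
                sum += value;
            }
            summary.put("min", min);
            summary.put("max", max);
            summary.put("avg", sum / points.size());
            return summary;
        }
    }

Only that handful of summary columns would then travel back to the client, 
rather than every point in the row.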

On Apr 22, 2012, at 11:35 AM, Brian O'Neill wrote:

> Praveen,
> 
> We are certainly interested. To get things moving we implemented an add-on 
> for Cassandra to demonstrate the viability (using AOP):
> https://github.com/hmsonline/cassandra-triggers
> 
> Right now the implementation executes triggers asynchronously, allowing you 
> to implement a Java interface and plug in your own Java class that will get 
> called for every insert.
> 
> Per the discussion on 1311, we intend to extend our proof of concept to be 
> able to invoke scripts as well.  (Minimally we'll enable JavaScript, but 
> we'll probably allow for Ruby and Groovy as well.)
> 
> -brian
> 
> On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
> 
>> I found that Triggers are coming in Cassandra 1.2 
>> (https://issues.apache.org/jira/browse/CASSANDRA-1311), but there is no 
>> mention of any StoredProc-like pattern.
>> 
>> I know this has been discussed many times but has never met with any real 
>> initiative. Even Groovy was phased out of the trunk.
>> 
>> Cassandra is great for logging, and as such it would be infinitely more 
>> useful if some logic could be pushed into the Cassandra cluster, nearer to 
>> the location of the data, to generate a materialized view useful for 
>> applications.
>> 
>> Server-side scripts/routines in distributed databases could soon prove to 
>> be the differentiating factor.
>> 
>> Let me reiterate the point with a use case.
>> 
>> In our application we store time-series data in wide rows, with a TTL set 
>> on each point to prevent the data from growing beyond acceptable limits. 
>> Still, the data size can make it prohibitively expensive to move all of it 
>> from the cluster node to the querying node and then to the application via 
>> Thrift for processing and presentation.
>> 
>> Ideally we would process the data on the node where it resides and pass 
>> only the materialized view of the data upstream. This would be trivial if 
>> Cassandra implemented some sort of server-side scripting and CQL semantics 
>> to call it.
>> 
>> Is anybody else interested in a similar feature? Is it being worked on? Are 
>> there any alternative strategies to this problem?
>> 
>> Praveen
>> 
>> 
> 
> -- 
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://weblogs.java.net/blog/boneill42/
> blog: http://brianoneill.blogspot.com/
> 
