Hi Rahul,
         Thanks for your answer. Why do you say that deleting from Spark is not 
elegant? This is exactly the feedback I want. I can delete either with delete 
prepared statements or through Spark. The TTL approach doesn't work for us: 
first of all, TTL is set at the column level, and our business rules for purge 
make a TTL solution not very clean in our case.
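For context, here is a minimal sketch of what I mean by applying business purge rules before issuing deletes. The table layout (an orders table keyed by account_id and order_id) and the retention rule are hypothetical, just for illustration:

```python
# Sketch of the "delete prepared statements" path, assuming a hypothetical
# orders table keyed by (account_id, order_id) and a made-up purge rule.
from datetime import datetime, timedelta

# Hypothetical business rule: purge closed orders older than 7 years.
RETENTION = timedelta(days=7 * 365)

def purge_candidates(rows, now):
    """Apply the business purge rule and return keys eligible for deletion."""
    return [
        (r["account_id"], r["order_id"])
        for r in rows
        if r["status"] == "CLOSED" and now - r["closed_at"] > RETENTION
    ]

# With the real cassandra-driver, one would prepare once and bind per key:
#   stmt = session.prepare(
#       "DELETE FROM ks.orders WHERE account_id = ? AND order_id = ?")
#   for key in purge_candidates(rows, datetime.utcnow()):
#       session.execute(stmt, key)

rows = [
    {"account_id": 1, "order_id": 10, "status": "CLOSED",
     "closed_at": datetime(2008, 1, 1)},
    {"account_id": 1, "order_id": 11, "status": "OPEN",
     "closed_at": datetime(2008, 1, 1)},
]
print(purge_candidates(rows, datetime(2018, 3, 22)))  # [(1, 10)]
```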

Thanks,
Charu

From: Rahul Singh <rahul.xavier.si...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, March 22, 2018 at 5:08 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>, 
"user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Using Spark to delete from Transactional Cluster

Short answer: it works. You can even run “delete” statements from within Spark 
once you know which keys to delete. Not elegant, but it works.

It will create a lot of tombstones, and you may need to spread your deletes 
over several days. Another option to consider: instead of deleting, set a TTL 
so the data eventually gets cleaned up.
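A minimal sketch of the "spread your deletes over days" idea: split the full set of keys into daily batches so the tombstones land gradually. The batch size is a made-up knob; in practice you would tune it against your compaction and tombstone metrics.

```python
# Sketch: split the full set of keys to delete into daily batches so the
# tombstones are spread out instead of landing all at once. The per-day
# batch size is a made-up knob, not a recommendation.
def daily_batches(keys, per_day):
    """Yield consecutive slices of `keys`, one slice per day."""
    for start in range(0, len(keys), per_day):
        yield keys[start:start + per_day]

keys = list(range(10))
print(list(daily_batches(keys, per_day=4)))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```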

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Mar 22, 2018, 2:19 PM -0500, Charulata Sharma (charshar) 
<chars...@cisco.com>, wrote:

Hi,
   I wanted to know the community’s experiences and feedback on using Apache 
Spark to delete data from a C* transactional cluster.
We have Spark installed in our analytical C* cluster, and so far we have used 
Spark only for analytics purposes.

However, now with the advanced features of Spark 2.0, I am considering using 
the spark-cassandra-connector for deletes instead of a series of delete 
prepared statements.
So essentially the deletes would happen on the analytical cluster and would be 
replicated over to the transactional cluster by means of our keyspace 
replication strategy.
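One way to sketch the per-partition delete step such a Spark job would run: a function like the one below could be handed to a partition-wise operation, with each executor opening its own driver session. The keyspace, table, and key columns are hypothetical, and `StubSession` stands in for a real driver session so the sketch runs standalone.

```python
# Sketch: issue deletes for all keys in one Spark partition through a
# driver session. In a real job each executor would open its own
# cassandra-driver session; here a stub session stands in so the sketch
# runs standalone. Table and key names are hypothetical.
def delete_partition(keys, session):
    """Prepare one DELETE and execute it for every key in the partition."""
    stmt = session.prepare(
        "DELETE FROM ks.orders WHERE account_id = ? AND order_id = ?")
    for key in keys:
        session.execute(stmt, key)

class StubSession:
    """Records executed statements in place of a live Cassandra session."""
    def __init__(self):
        self.executed = []
    def prepare(self, cql):
        return cql
    def execute(self, stmt, params):
        self.executed.append((stmt, params))

session = StubSession()
delete_partition([(1, 10), (2, 20)], session)
print(len(session.executed))  # 2
```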

Are there any risks involved in this?

Thanks,
Charu
