The easiest way to do this is to create another Cassandra DC and point
Spark to it, since Spark can operate directly on data in Cassandra. Spark
then reads only from the new DC, so there is no impact on production C*
performance and no complex backup/restore process is required; you just
let Cassandra replicate the data for you.
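
As a rough sketch of what that setup involves, assuming a keyspace called
my_ks, an existing DC called prod, a new DC called analytics and a contact
point of 10.0.1.10 (all of these names are made up for illustration): you
extend the keyspace's replication to the new DC, then tell the Spark
Cassandra Connector to talk only to that DC. The property name for the
local DC is an assumption on my part; I believe recent connector releases
use spark.cassandra.connection.localDC and older ones used
spark.cassandra.connection.local_dc, so check the reference for your
version.

```python
# Sketch only: keyspace, DC names, host and table are hypothetical.

def alter_keyspace_cql(keyspace, replicas_per_dc):
    """Build the CQL that extends a keyspace's replication to every listed DC."""
    dc_part = ", ".join(
        f"'{dc}': {rf}" for dc, rf in sorted(replicas_per_dc.items())
    )
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {dc_part}}};"
    )


def spark_cassandra_conf(contact_point, local_dc,
                         local_dc_key="spark.cassandra.connection.localDC"):
    """Spark settings meant to keep connector traffic on the analytics DC.

    local_dc_key is an assumption: the property name varies by connector version.
    """
    return {
        "spark.cassandra.connection.host": contact_point,
        local_dc_key: local_dc,
    }


def read_table(spark, keyspace, table):
    """Load one Cassandra table as a DataFrame.

    Needs a SparkSession with the Spark Cassandra Connector on the classpath.
    """
    return (
        spark.read.format("org.apache.spark.sql.cassandra")
        .options(keyspace=keyspace, table=table)
        .load()
    )


cql = alter_keyspace_cql("my_ks", {"prod": 3, "analytics": 2})
conf = spark_cassandra_conf("10.0.1.10", "analytics")
print(cql)
```

After running the generated ALTER KEYSPACE statement, you would typically
run "nodetool rebuild -- prod" on each node in the new DC to stream the
existing data across. The conf dictionary can then be applied to a
SparkSession builder, one .config(key, value) call per entry, before
calling read_table.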

If you need a scalable bulk export/import solution for your production data
that doesn't affect C* performance, I would look at doing something like
this: https://www.youtube.com/watch?v=eY5oSZnwmJg . It's the best solution
I've seen to the problem.

On Wed, 19 Jul 2017 at 23:42 Fd Habash <fmhab...@gmail.com> wrote:

> I have a scenario where data has to be loaded into Spark nodes from two
> data stores: Oracle and Cassandra. We did the initial loading of data and
> found a way to do daily incremental loading from Oracle to Spark.
>
> I’m trying to figure out how to do this from C*. What tools are available
> in C* to do incremental backup/restore/load?
>
> Thanks
>
-- 
Justin Cameron
Senior Software Engineer
<https://www.instaclustr.com/>

