Re: How can I efficiently export the content of my table to KAFKA

2017-04-28 Thread Tobias Eriksson
Hi Chris,

Well, that seemed like a good idea at first. I would like to read from Cassandra and post to Kafka, but the Kafka Connect Cassandra Source requires that the table has a time-series order, and not all of my tables do. So thanks for the tip, but it did not work ☹

-Tobias

Re: How can I efficiently export the content of my table to KAFKA

2017-04-27 Thread Chris Stromberger
Maybe https://www.confluent.io/blog/kafka-connect-cassandra-sink-the-perfect-match/

On Wed, Apr 26, 2017 at 2:49 PM, Tobias Eriksson <tobias.eriks...@qvantel.com> wrote:
> Hi
>
> I would like to make a dump of the database, in JSON format, to Kafka
>
> The database contains lots of data,

Re: How can I efficiently export the content of my table to KAFKA

2017-04-26 Thread Justin Cameron
You can run multiple applications in parallel in Standalone mode - you just need to configure Spark to allocate resources between your jobs the way you want (by default it assigns all resources to the first application you run, so they won't be freed up until it has finished). You can use Spark's
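A minimal sketch of that kind of resource capping, assuming a Standalone cluster; the application name, master URL, and core/memory figures below are placeholders, not values from this thread (spark-shell style snippet):

import org.apache.spark.{SparkConf, SparkContext}

// Capping spark.cores.max leaves cores free for a second application
// to be scheduled while this one is still running; without it,
// Standalone mode hands the first application every available core.
val conf = new SparkConf()
  .setAppName("individuals-export")     // placeholder name
  .setMaster("spark://master:7077")     // placeholder master URL
  .set("spark.cores.max", "8")          // cap this app's total cores
  .set("spark.executor.memory", "4g")   // cap memory per executor
val sc = new SparkContext(conf)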

Re: How can I efficiently export the content of my table to KAFKA

2017-04-26 Thread Tobias Eriksson
Well, I have been working some with Spark, and the biggest hurdle is that Spark does not allow me to run multiple jobs in parallel, i.e. from the point of starting the job that takes the table of “Individuals”, I will have to wait until all that processing is done before I can start an additional one

Re: How can I efficiently export the content of my table to KAFKA

2017-04-26 Thread Justin Cameron
You could probably save yourself a lot of hassle by just writing a Spark job that scans through the entire table, converts each row to JSON and dumps the output into a Kafka topic. It should be fairly straightforward to implement. Spark will manage the partitioning of "Producer" processes for you
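A minimal sketch of such a job, assuming the DataStax spark-cassandra-connector and the plain Kafka producer client; the keyspace, table, topic, and host names are placeholders, and the hand-rolled JSON is for illustration only:

import java.util.Properties
import com.datastax.spark.connector._
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.{SparkConf, SparkContext}

object TableToKafka {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("cassandra-to-kafka")
      .set("spark.cassandra.connection.host", "cassandra-host") // placeholder
    val sc = new SparkContext(conf)

    // Full table scan; the connector splits it into Spark partitions
    // that follow the Cassandra token ranges.
    sc.cassandraTable("my_keyspace", "individuals") // placeholder names
      .foreachPartition { rows =>
        // One producer per Spark partition, not one per row.
        val props = new Properties()
        props.put("bootstrap.servers", "kafka-host:9092") // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        val producer = new KafkaProducer[String, String](props)

        rows.foreach { row =>
          // Naive JSON rendering for illustration; a real job should use
          // a JSON library and handle types, nulls, and escaping properly.
          val json = row.columnNames.zip(row.columnValues)
            .map { case (name, value) => s"\"$name\":\"$value\"" }
            .mkString("{", ",", "}")
          producer.send(new ProducerRecord[String, String]("individuals-json", json))
        }
        producer.close() // flushes any buffered records
      }
    sc.stop()
  }
}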