Re: HELP with bulk loading

2017-03-14 Thread Artur R
Thank you all! It turns out that the fastest ways are https://github.com/brianmhess/cassandra-loader and COPY FROM. So I decided to stick with COPY FROM, as it is built in and easy to use.
On Fri, Mar 10, 2017 at 2:22 PM, Ahmed Eljami wrote:
> Hi,
>
> >3. sstableloader is
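For anyone who wants to script the COPY FROM approach, here is a minimal Python sketch that only assembles the cqlsh command line; it does not run anything. The keyspace, table, column names, CSV path, host, and option values are all hypothetical. HEADER, NUMPROCESSES, and CHUNKSIZE are COPY FROM options in recent cqlsh releases, but please check HELP COPY in your own cqlsh before relying on them.

```python
import shlex


def build_copy_from(keyspace, table, columns, csv_path, options=None):
    """Return a CQL 'COPY ... FROM' statement as a string."""
    cols = ", ".join(columns)
    stmt = f"COPY {keyspace}.{table} ({cols}) FROM '{csv_path}'"
    if options:
        # Render options as "NAME = value AND NAME = value".
        opts = " AND ".join(f"{name} = {value}" for name, value in options.items())
        stmt += f" WITH {opts}"
    return stmt


def build_cqlsh_argv(host, statement):
    """Return the argv list for running one statement through 'cqlsh -e'."""
    return ["cqlsh", host, "-e", statement]


# Hypothetical names and values, for illustration only.
stmt = build_copy_from(
    "demo_ks",
    "events",
    ["id", "ts", "payload"],
    "/data/events.csv",
    {"HEADER": "true", "NUMPROCESSES": 8, "CHUNKSIZE": 5000},
)
argv = build_cqlsh_argv("10.0.0.1", stmt)

# Print a shell-quoted version that could be pasted into a terminal.
print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the argv as a list (rather than one shell string) makes it straightforward to pass to subprocess.run later without worrying about quoting the statement.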

Re: HELP with bulk loading

2017-03-10 Thread Ahmed Eljami
Hi,
>3. sstableloader is slow too. Assuming that I have new empty C* cluster, how can I improve the upload speed? Maybe disable replication or some other settings while streaming and then turn it back?
Maybe you can accelerate your load with the option -cph (connections per host):
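As a rough illustration of the -cph suggestion, the Python sketch below assembles an sstableloader argument list with -d (initial hosts) and -cph (connections per host). The host addresses, the SSTable directory, and the value 4 are assumptions made up for the example, and the best -cph value will depend on the cluster and network.

```python
def build_sstableloader_argv(hosts, sstable_dir, connections_per_host=1):
    """Return an argv list for sstableloader.

    hosts: initial contact points, passed to -d as a comma-separated list.
    sstable_dir: directory holding the SSTables (conventionally .../keyspace/table).
    connections_per_host: value passed to -cph.
    """
    if connections_per_host < 1:
        raise ValueError("connections_per_host must be at least 1")
    return [
        "sstableloader",
        "-d", ",".join(hosts),
        "-cph", str(connections_per_host),
        sstable_dir,
    ]


# Hypothetical hosts and path, for illustration only.
argv = build_sstableloader_argv(
    ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    "/data/sstables/demo_ks/events",
    connections_per_host=4,
)
print(" ".join(argv))
```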

Re: HELP with bulk loading

2017-03-09 Thread Stefania Alborghetti
When I tested cqlsh COPY FROM for CASSANDRA-11053, I was able to import about 20 GB in under 4 minutes on a cluster with 8 nodes using
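To put that figure in perspective, here is a small back-of-the-envelope calculation in Python. It assumes decimal units (1 GB = 1000 MB) and treats "under 4 minutes" as exactly 240 seconds, so the rate is a lower bound. The extrapolation to the 500 GB from the original question assumes the same rate holds on different hardware and a smaller cluster, which it may well not.

```python
def throughput_mb_per_s(size_gb, seconds):
    """Average throughput in MB/s, using 1 GB = 1000 MB."""
    return size_gb * 1000 / seconds


def hours_to_load(size_gb, rate_mb_per_s):
    """Hours needed to load size_gb at a constant rate in MB/s."""
    return size_gb * 1000 / rate_mb_per_s / 3600


# About 20 GB in (just under) 4 minutes.
observed_rate = throughput_mb_per_s(20, 4 * 60)
print(f"observed rate: at least {observed_rate:.1f} MB/s")

# Naive extrapolation to 500 GB at the same rate.
print(f"500 GB at that rate: about {hours_to_load(500, observed_rate):.2f} hours")
```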

Re: HELP with bulk loading

2017-03-09 Thread Ryan Svihla
I suggest using cassandra-loader: https://github.com/brianmhess/cassandra-loader
On Mar 9, 2017 5:30 PM, "Artur R" wrote:
> Hello all!
>
> There are ~500gb of CSV files and I am trying to find the way how to
> upload them to C* table (new empty C* cluster of 3 nodes,

HELP with bulk loading

2017-03-09 Thread Artur R
Hello all! There are ~500 GB of CSV files, and I am trying to find a way to upload them to a C* table (new empty C* cluster of 3 nodes, replication factor 2) within a reasonable time (say, 10 hours using 3-4 c3.8xlarge EC2 instances). My first impulse was to use CQLSSTableWriter, but it
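The target above translates into a required ingest rate that can be worked out quickly. The Python sketch below assumes decimal units (1 GB = 1000 MB), an even spread of data across the 3 nodes, and that the volume written by the cluster is simply the CSV volume times the replication factor. Real on-disk size differs from CSV size, so treat the numbers as rough.

```python
def required_rate_mb_per_s(size_gb, hours):
    """CSV input rate in MB/s needed to finish within the time budget."""
    return size_gb * 1000 / (hours * 3600)


def per_node_write_rate(input_rate_mb_per_s, replication_factor, nodes):
    """Approximate write rate per node, assuming replicas are spread evenly."""
    return input_rate_mb_per_s * replication_factor / nodes


# ~500 GB of CSV within 10 hours.
input_rate = required_rate_mb_per_s(500, 10)

# 3 nodes, replication factor 2.
node_rate = per_node_write_rate(input_rate, replication_factor=2, nodes=3)

print(f"required input rate: {input_rate:.2f} MB/s")
print(f"approx. writes per node: {node_rate:.2f} MB/s")
```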