Re: Not what I‘ve expected Performance

2018-02-01 Thread Jürgen Albersdorfer
I changed it a little to use spark.sql, extracted a partition-key table as you did with the userid, and joined it to the table I want to copy. Saving the result to Cassandra seemed, in a first test, to use every bit of performance the cluster can provide. I don't yet know why the first code did

Re: Not what I‘ve expected Performance

2018-02-01 Thread kurt greaves
That extra code is not necessary; it's just there to retrieve only a sample of the keys. You don't want it if you're copying the whole table. It sounds like you're taking the right approach and probably just need some more tuning. The bottleneck might be on the Cassandra side as well (concurrent_reads/concurrent_writes). On 1 Feb.

Re: Not what I‘ve expected Performance

2018-02-01 Thread Jürgen Albersdorfer
Hi Kurt, thanks for your response. I did indeed use Spark, which I forgot to mention, and I did it nearly the same way as in the example you gave me, just without the .select(PK).sample(false, 0.1) instruction, whose purpose I don't actually understand. Maybe that's the key to the castl

Re: Not what I‘ve expected Performance

2018-01-31 Thread kurt greaves
How are you copying? With cqlsh COPY or your own script? If you've already got Spark, it's quite simple to copy between tables, and it should be pretty much as fast as you can get (you may even need to throttle). There's some sample code here (albeit it's copying between clusters but easily tailo

Not what I‘ve expected Performance

2018-01-30 Thread Jürgen Albersdorfer
Hi, We are using C* 3.11.1 with a 9 Node Cluster built on CentOS Servers each having 2x Quad Core Xeon, 128GB of RAM and two separate 2TB spinning Disks, one for Log one for Data with Spark on Top. Due to bad Schema (Partitions of about 4 to 8 GB) I need to copy a whole Table into another ha