[ 
https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150253#comment-15150253
 ] 

Stefania commented on CASSANDRA-11053:
--------------------------------------

Here are the latest results:

||MODULE CYTHONIZED||PREPARED STATEMENTS||NUM. WORKER PROCESSES||CHUNK SIZE||AVERAGE ROWS / SEC||TOTAL TIME||
|DRIVER|YES|7|5,000|97,146|3' 31"|
|DRIVER|YES|8|5,000|103,037|3' 19"|
|DRIVER|YES|9|5,000|104,070|3' 17"|
|DRIVER|YES|10|5,000|*104,498*|3' 16"|
|DRIVER COPYUTIL|YES|7|5,000|89,123|3' 48"|
|DRIVER COPYUTIL|YES|8|5,000|107,897|3' 10"|
|DRIVER COPYUTIL|YES|9|5,000|*109,871*|3' 7"|
|DRIVER COPYUTIL|YES|10|5,000|109,616|3' 8"|

In addition to using separate pipes as mentioned above, I've found one more 
optimization and I've calibrated how much data the parent process sends to the 
worker processes. Two default parameters have changed: the max ingest rate is 
now 150k rows per second and the report frequency has been reduced from 4 times 
per second to 2. I've run cqlsh with {{SCHED_BATCH}} CPU scheduling 
({{schedtool -B -e ./bin/cqlsh}}), which helps a little (maybe 2-3k 
rows/second), and I've changed the clock source from {{xen}} to {{tsc}} (unsure 
if this helps, but it doesn't hurt).
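
For anyone wanting to reproduce the scheduling and clock source settings, here is a minimal Python sketch (Linux only). It is not part of the patch: the helper names {{try_sched_batch}} and {{read_clocksource}} are made up for illustration, and the sysfs path assumes the standard {{clocksource0}} device. It requests {{SCHED_BATCH}} for the current process, roughly what {{schedtool -B}} does before exec'ing the command, and reads the clock source the kernel is currently using.

```python
import os

# Assumed standard sysfs location of the active clock source on Linux.
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def try_sched_batch(pid=0):
    """Try to switch `pid` (0 = current process) to SCHED_BATCH.

    Returns True on success, False if unsupported or not permitted.
    """
    if not hasattr(os, "SCHED_BATCH") or not hasattr(os, "sched_setscheduler"):
        return False  # not a Linux platform
    try:
        # SCHED_BATCH is a non-realtime policy, so the static priority must be 0.
        os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    except OSError:
        return False
    return os.sched_getscheduler(pid) == os.SCHED_BATCH


def read_clocksource():
    """Return (current, available) clock sources, or None if sysfs is missing."""
    try:
        with open(os.path.join(CLOCKSOURCE_DIR, "current_clocksource")) as f:
            current = f.read().strip()
        with open(os.path.join(CLOCKSOURCE_DIR, "available_clocksource")) as f:
            available = f.read().split()
    except OSError:
        return None
    return current, available


if __name__ == "__main__":
    print("SCHED_BATCH active:", try_sched_batch())
    print("clock source:", read_clocksource())
```

Changing the clock source requires root and writing the new name to {{current_clocksource}}; the sketch only reads it.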

I would like to repeat the tests on an AWS instance with twice the number of 
cores, to see how much we can scale. I've already verified that if we halve the 
number of cores (by fixing the affinity to only 4 cores) then the throughput 
also halves. I'm thinking of testing on C4.4xlarge. So far I've used R3.2xlarge 
but we don't need all that memory and so I would like to try a C4 instance 
instead. 
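
The comment does not say how the affinity was fixed, so the following is only one possible way to do it: a small Python sketch (Linux only) that restricts the current process to a given number of cores. The helper name {{limit_to_cores}} is hypothetical. Child processes started afterwards inherit the affinity mask, so calling it before the workers are spawned would confine them as well.

```python
import os


def limit_to_cores(n_cores, pid=0):
    """Pin `pid` (0 = current process) to the first `n_cores` CPUs it may use.

    Returns the resulting CPU set, or None where affinity is unsupported.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # not a Linux platform
    # Only choose among CPUs the process is already allowed to run on.
    allowed = sorted(os.sched_getaffinity(pid))
    target = set(allowed[:max(1, n_cores)])
    os.sched_setaffinity(pid, target)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Example: confine this process (and future children) to 4 cores.
    print("running on CPUs:", limit_to_cores(4))
```

The same effect can be had from the shell without touching the code, for example with {{taskset -c 0-3 ./bin/cqlsh}}.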

> COPY FROM on large datasets: fix progress report and debug performance
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-11053
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11053
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>         Attachments: copy_from_large_benchmark.txt, 
> copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, 
> worker_profiles.txt, worker_profiles_2.txt
>
>
> Running COPY FROM on a large dataset (20G divided in 20M records) revealed 
> two issues:
> * The progress report is incorrect: it is very slow until almost the end of 
> the test, at which point it catches up extremely quickly.
> * The performance in rows per second is similar to running smaller tests with 
> a smaller cluster locally (approx 35,000 rows per second). As a comparison, 
> cassandra-stress manages 50,000 rows per second under the same set-up, 
> making it roughly 1.5 times faster. 
> See attached file _copy_from_large_benchmark.txt_ for the benchmark details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
