Re: Cassandra client tuning

2018-03-18 Thread Ben Slater
“* 1000 statements in each batch” sounds like you are doing batching in both cases. I wouldn't expect things to get better with batch sizes larger than that. We’ve generally found that around 100 is the sweet spot, but I’m sure it’s data specific.

Re: Cassandra client tuning

2018-03-18 Thread onmstester onmstester
I'm using a queue of 100 ExecuteAsync calls * 1000 statements in each batch = 100K insert queue in the non-batch scenario. Using more than 1000 statements per batch throws a batch limit exception, and some documents recommend not changing batch_size_limit??!

Re: Cassandra client tuning

2018-03-18 Thread Ben Slater
When you say batch was worse than async in terms of throughput, are you comparing throughput with the same number of threads or something? I would have thought that if you have much less CPU usage on the client with batching, and your Cassandra cluster doesn’t sound terribly stressed, then there is room

Re: Cassandra client tuning

2018-03-18 Thread onmstester onmstester
Input data does not preserve good locality, and I've already tested batch insert: it was worse than executeAsync in terms of throughput but used much less CPU on the client side.

Re: Cassandra client tuning

2018-03-18 Thread Ben Slater
You will probably find that grouping writes into small batches improves overall performance (if you are not doing it already). See the following presentation for some more info: https://www.slideshare.net/Instaclustr/microbatching-highperformance-writes Cheers Ben
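
To make the "small batches" suggestion concrete, here is a minimal, driver-agnostic sketch in Python. It is not taken from the linked presentation. It assumes rows can be grouped by a partition key before being sent, and it caps each group at a configurable size (100 here, following the sweet spot mentioned elsewhere in this thread). The names `micro_batches`, `partition_key`, and the sample `sensor` rows are hypothetical, and the sketch buffers all rows in memory, which a real ingest path would likely replace with periodic flushing.

```python
from collections import defaultdict
from itertools import islice


def micro_batches(rows, partition_key, max_size=100):
    """Yield lists of at most max_size rows that all share one partition key."""
    by_partition = defaultdict(list)
    for row in rows:
        by_partition[partition_key(row)].append(row)

    for group in by_partition.values():
        it = iter(group)
        while True:
            chunk = list(islice(it, max_size))
            if not chunk:
                break
            yield chunk


# Hypothetical input: 250 rows spread over 3 partitions.
rows = [{"sensor": i % 3, "value": i} for i in range(250)]
batches = list(micro_batches(rows, lambda r: r["sensor"], max_size=50))
```

Each yielded list would then be turned into one batch statement by whatever driver is in use. Whether this helps when the input has poor locality depends on how many rows per partition accumulate before a flush.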

Cassandra client tuning

2018-03-18 Thread onmstester onmstester
I need to insert several million records into Cassandra within seconds. Using one client with asyncExecute and the following configs: maxConnectionsPerHost = 5, maxRequestsPerHost = 32K, maxAsyncQueue at client side = 100K, I could achieve 25% of the throughput I needed; client CPU is more than 80% and
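
As an illustration of what a client-side async queue limit does, the sketch below caps the number of in-flight asynchronous writes with a semaphore. It is an assumption-laden stand-in, not the poster's code: it uses Python's standard `concurrent.futures` futures rather than any Cassandra driver API, and `run_bounded`, `fake_write`, and the limit of 4 are hypothetical names and values chosen for the demo.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_bounded(submit, items, max_in_flight):
    """Call submit(item) -> Future for each item, keeping at most
    max_in_flight futures pending at any time."""
    slots = threading.BoundedSemaphore(max_in_flight)
    futures = []
    for item in items:
        slots.acquire()  # blocks the producer while the limit is reached
        try:
            future = submit(item)
        except Exception:
            slots.release()  # do not leak the slot if submission fails
            raise
        future.add_done_callback(lambda _f: slots.release())
        futures.append(future)
    return [f.result() for f in futures]


# Demo with a fake write that records peak concurrency (stand-in for a real insert).
stats = {"active": 0, "peak": 0}
lock = threading.Lock()


def fake_write(item):
    with lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.001)
    with lock:
        stats["active"] -= 1
    return item


with ThreadPoolExecutor(max_workers=8) as pool:
    results = run_bounded(lambda x: pool.submit(fake_write, x), range(50), max_in_flight=4)
```

The limit trades memory and client CPU against throughput: a very large queue (such as 100K) mostly moves the backlog into the client, while a small one applies backpressure to the producer sooner.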