To batch or not to batch: A question for fast inserts

Gerard Maas Tue, 22 Sep 2015 07:56:16 -0700

The general advice is that individual async inserts are the fastest way to
insert data into Cassandra. Our insertion mechanism is based on that model,
and recently we have been evaluating its performance, looking to measure and
optimize our ingestion rate.


As a side track, I ran some one-off benchmarks and stumbled on the observation
that unlogged batch inserts are *A LOT* faster than their async counterparts.

In our tests, unlogged batches show increased throughput and lower cluster
CPU usage, so I'm wondering where the tradeoff might be.
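
To make the comparison concrete, here is a minimal sketch of the two shapes
of write being compared. It only builds CQL text and does not talk to a
cluster. The keyspace, table, and column names (demo.readings, sensor_id, ts,
value) are hypothetical and are not taken from the benchmark; real code would
use prepared statements through a driver rather than string formatting.

```python
from typing import List, Sequence, Tuple

# Hypothetical row shape: (sensor_id, ts, value). Not from the benchmark.
Row = Tuple[str, int, float]

# Illustration only: string formatting is unsafe for untrusted input.
# A real client would bind values to a prepared statement instead.
INSERT = "INSERT INTO demo.readings (sensor_id, ts, value) VALUES ('{}', {}, {})"


def individual_inserts(rows: Sequence[Row]) -> List[str]:
    """One standalone INSERT per row, each sent as its own (async) request."""
    return [INSERT.format(*row) + ";" for row in rows]


def unlogged_batch(rows: Sequence[Row]) -> str:
    """All rows wrapped in a single UNLOGGED BATCH, sent as one request."""
    body = "\n".join("  " + INSERT.format(*row) + ";" for row in rows)
    return "BEGIN UNLOGGED BATCH\n" + body + "\nAPPLY BATCH;"


if __name__ == "__main__":
    sample = [("s1", 1, 0.5), ("s1", 2, 0.7)]
    for statement in individual_inserts(sample):
        print(statement)
    print(unlogged_batch(sample))
```

The first function yields as many requests as there are rows; the second
yields a single request carrying all of them, which is the difference the
benchmark numbers are about.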

I compiled those observations in the document linked below, which I'm sharing
and opening up for comments. Are we observing some artifact, or should we set
the record straight that unlogged batches achieve better insertion throughput?

https://docs.google.com/document/d/1qSIJ46cmjKggxm1yxboI-KhYJh1gnA6RK-FkfUg6FrI

Let me know.

Kind regards,

Gerard.
