insert performance (1.2.8)

Keith Freeman Mon, 19 Aug 2013 15:50:36 -0700

I've got a 3-node Cassandra cluster (16G/4-core VMs on ESXi v5, on 2.5 GHz machines not shared with any other VMs). I'm inserting time-series data into a single column family using "wide rows" (timeuuids) and have a 3-part partition key, so my primary key is something like ((a, b, day), in-time-uuid), x, y, z).
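To make the key layout concrete, here is a rough sketch in plain Java of how a (a, b, day) partition key might be derived from an event timestamp. Everything in it is an assumption for illustration: the class name `DayBucket`, the string types for `a` and `b`, the UTC day boundary, and the ":" separator are hypothetical and not taken from the actual schema or client. It uses no Cassandra driver.

```java
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical helper: models the 3-part partition key (a, b, day).
final class DayBucket {

    // Returns the UTC calendar day of the event in ISO form (yyyy-MM-dd).
    // Assumption: days are cut at UTC midnight.
    static String dayOf(Instant eventTime) {
        return eventTime.atZone(ZoneOffset.UTC).toLocalDate().toString();
    }

    // Joins the three partition-key parts into one string for display.
    // Every row sharing this value lands in the same wide row.
    static String partitionKey(String a, String b, Instant eventTime) {
        return a + ":" + b + ":" + dayOf(eventTime);
    }
}
```

With a layout like this, all inserts for one (a, b) pair on a given day go to the same partition, and the timeuuid orders the columns within it.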

My Java client is feeding rows (about 1k of raw data each) in batches using multiple threads, and the fastest I can get it to run reliably is about 2000 rows/second. Even at that speed, all 3 Cassandra nodes are very CPU bound, with loads of 6-9 each (and the client machine is hardly breaking a sweat). I've tried turning off compression in my table, which reduced the loads slightly but not much. There are no other updates or reads occurring, except from DataStax OpsCenter.

I was expecting to be able to insert at least 10k rows/second with this configuration, and after a lot of reading of docs, blogs, and Google, I can't really figure out what's slowing my client down. When I increase the insert speed of my client beyond 2000/second, the server responses are just too slow and the client falls behind. I had a single-node MySQL database that can handle 10k of these data rows/second, so I really feel like I'm missing something in Cassandra. Any ideas?
