I have a Cassandra 2.0.6 cluster with four nodes used as a backup database. The
only operation is writing data into the db. Recently, full GC on the nodes has
increased noticeably and is blocking cluster operation.
The data load on each node is 10 GB. The heap is 8 GB per node, with default
JVM memory settings. The CPU count is 24
The limitation is on the driver side. Try looking at
execute_concurrent_with_args in the cassandra.concurrent module to get
parallel writes with prepared statements.
https://datastax.github.io/python-driver/api/cassandra/concurrent.html
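A minimal sketch of that API, assuming a hypothetical keyspace `backup_ks` and table `events` (neither is from the thread), and a local contact point. The `rows_to_params` helper is illustrative; `execute_concurrent_with_args` itself takes the session, a prepared statement, a sequence of parameter tuples, and a `concurrency` limit:

```python
def rows_to_params(rows):
    """Turn row dicts into the (id, value) tuples the INSERT below expects."""
    return [(r["id"], r["value"]) for r in rows]

def bulk_insert(rows, contact_points=("127.0.0.1",)):
    # Driver imports kept inside the function so the sketch reads
    # standalone; requires the cassandra-driver package at runtime.
    from cassandra.cluster import Cluster
    from cassandra.concurrent import execute_concurrent_with_args

    cluster = Cluster(list(contact_points))
    session = cluster.connect("backup_ks")  # hypothetical keyspace
    insert = session.prepare("INSERT INTO events (id, value) VALUES (?, ?)")
    # Keeps up to `concurrency` requests in flight at once.
    results = execute_concurrent_with_args(
        session, insert, rows_to_params(rows), concurrency=100)
    cluster.shutdown()
    return results
```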
On Wed, Dec 30, 2015 at 11:34 PM Alexandre Beaulne wrote:
To add to what Jonathan and Jack have said...
To get high levels of performance with the python driver you should:
- prepare your statements once (recent drivers default to Token Aware
routing, and will correctly apply it if the statement is prepared).
- execute asynchronously (up to ~150
The simplest option is to use Java 8 with the G1 garbage collector.
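A config sketch of what that change looks like in cassandra-env.sh: comment out the default CMS flags and enable G1 instead. The pause target below is a tuning choice, not a universal recommendation:

```shell
# In cassandra-env.sh: disable the CMS-related JVM_OPTS lines, then add:
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"
```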
> On 31 Dec 2015, at 10:23 a.m., Shuo Chen wrote:
Make sure the driver is configured for token-aware routing; otherwise the
coordinator node may have to redirect your write, adding a network hop.
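With the Python driver, that configuration looks roughly like the sketch below (the contact point is an assumption; newer drivers enable this by default, as noted above). `TokenAwarePolicy` routes each request to a replica owning the partition and wraps a child policy that orders the replicas:

```python
def build_cluster(contact_points=("127.0.0.1",)):
    # Driver imports kept inside the function so the sketch reads
    # standalone; requires the cassandra-driver package at runtime.
    from cassandra.cluster import Cluster
    from cassandra.policies import TokenAwarePolicy, DCAwareRoundRobinPolicy

    # Token-aware routing avoids the extra coordinator hop by sending
    # each request straight to a replica for its partition key.
    return Cluster(
        list(contact_points),
        load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy()),
    )
```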
To be absolutely clear, Cassandra uses the distributed, parallel model for
Big Data: lots of multi-threaded clients with lots of nodes.
If you are lucky that might mask the real issue, but I doubt it… that is an
insane number of compaction tasks and indicative of another problem. I would
check the release notes for 2.0.6 and later; if I recall correctly, that was
not a stable version and may have had memory leaks.
Aside from that, just FYI, if you use