[
https://issues.apache.org/jira/browse/CASSANDRA-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510976#comment-14510976
]
Robert Stupp commented on CASSANDRA-9230:
-----------------------------------------
I did some benchmarking. Each of the following variants ran in its own JVM with
embedded C* and the Java driver connecting to 127.0.0.1. The benchmark used only
a single table and only "re-prepares", since that is the scenario in the description.
# prepare 10 statements iteratively ("Single" in the full results below)
# prepare 10 statements concurrently ("Concurrent")
# prepare 10 statements with a new prepare-multi protocol request/response ("Multi")
Variants 1 + 2 are basically equal on a somewhat overloaded system (10
benchmark threads on a Core i7, plus the threads created by C* and the driver):
approximately 1.2 operations per millisecond, with similar runtime
characteristics (GC, sync, safepoints).
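The difference between variants 1 and 2 is only the call pattern, which can be sketched roughly like this. This is a minimal simulation, not the benchmark code: `prepare` below is a hypothetical stand-in for the driver's `Session.prepare()`/`prepareAsync()`, and no real cluster is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class PrepareVariants {
    // Hypothetical stand-in for Session.prepare(): in reality this is one
    // network round trip per statement.
    static String prepare(String cql) {
        return "prepared:" + cql;
    }

    // Variant 1 ("Single"): prepare statements one after another,
    // i.e. N sequential round trips.
    static List<String> prepareSingle(List<String> stmts) {
        List<String> out = new ArrayList<>();
        for (String s : stmts) out.add(prepare(s));
        return out;
    }

    // Variant 2 ("Concurrent"): issue all prepares at once and wait,
    // i.e. N round trips in flight concurrently (the driver 2.1 shape
    // would be prepareAsync() plus waiting on the returned futures).
    static List<String> prepareConcurrent(List<String> stmts) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (String s : stmts)
            futures.add(CompletableFuture.supplyAsync(() -> prepare(s)));
        List<String> out = new ArrayList<>();
        for (CompletableFuture<String> f : futures) out.add(f.join());
        return out;
    }

    public static void main(String[] args) {
        List<String> stmts = new ArrayList<>();
        for (int i = 0; i < 10; i++)
            stmts.add("SELECT * FROM t WHERE id = " + i);
        System.out.println(prepareSingle(stmts).size());      // 10
        System.out.println(prepareConcurrent(stmts).size());  // 10
    }
}
```

Either way the server sees 10 separate PREPARE messages; variant 2 only overlaps their latencies.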
Variant 3 is interesting: it performs nearly 8 operations per millisecond with
less GC pressure (eden + survivor spaces) and less sync/safepoint work.
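The client-side shape of variant 3, again as a hypothetical simulation: the real change lives in the native protocol (see the branches below), and `prepareMulti` plus the round-trip counter are invented names used here only to model the accounting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class MultiPrepare {
    static final AtomicInteger roundTrips = new AtomicInteger();

    // Variant 3 ("Multi"): all CQL strings travel in ONE protocol request
    // and all prepared results come back in ONE response, so the number of
    // round trips is 1 regardless of the number of statements.
    static List<String> prepareMulti(List<String> stmts) {
        roundTrips.incrementAndGet();  // a single network round trip
        List<String> out = new ArrayList<>();
        for (String s : stmts) out.add("prepared:" + s);
        return out;
    }

    public static void main(String[] args) {
        List<String> stmts = new ArrayList<>();
        for (int i = 0; i < 10; i++)
            stmts.add("SELECT * FROM t WHERE id = " + i);
        List<String> prepared = prepareMulti(stmts);
        // prints: 10 statements, 1 round trip(s)
        System.out.println(prepared.size() + " statements, "
                + roundTrips.get() + " round trip(s)");
    }
}
```

Fewer messages also means fewer allocations and less lock traffic per prepared statement, which matches the lower gc.alloc.rate.norm and contendedLockAttempts numbers below.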
References:
(Disclaimer: both the code and the benchmark are quick-and-dirty)
* C* branch (based on trunk):
https://github.com/snazy/cassandra/tree/9230-multi-prepare
* Driver branch (based on 2.1):
https://github.com/snazy/java-driver/tree/C9230-multi-prepare
Full results (with JMH 1.9):
{code}
Benchmark                                    Mode  Cnt       Score        Error  Units
Concurrent                                  thrpt   10       1,202 ±      0,102  ops/ms
Concurrent:·gc.alloc.rate                   thrpt   10     183,951 ±     53,938  MB/sec
Concurrent:·gc.alloc.rate.norm              thrpt   10  256993,905 ±  11495,864  B/op
Concurrent:·gc.churn.PS_Eden_Space          thrpt   10     212,334 ±    133,280  MB/sec
Concurrent:·gc.churn.PS_Eden_Space.norm     thrpt   10  297916,339 ± 168588,842  B/op
Concurrent:·gc.churn.PS_Survivor_Space      thrpt   10       1,045 ±      2,626  MB/sec
Concurrent:·gc.churn.PS_Survivor_Space.norm thrpt   10    1435,132 ±   3614,519  B/op
Concurrent:·gc.count                        thrpt   10      12,000               counts
Concurrent:·gc.time                         thrpt   10     104,000               ms
Concurrent:·rt.safepointSyncTime            thrpt   10       0,094               ms
Concurrent:·rt.safepointTime                thrpt   10       0,497               ms
Concurrent:·rt.safepoints                   thrpt   10     628,000               counts
Concurrent:·rt.sync.contendedLockAttempts   thrpt   10   19344,000               locks
Concurrent:·rt.sync.fatMonitors             thrpt   10    4096,000               monitors
Concurrent:·rt.sync.futileWakeups           thrpt   10     283,000               counts
Concurrent:·rt.sync.monitorDeflations       thrpt   10    2357,000               monitors
Concurrent:·rt.sync.monitorInflations       thrpt   10    2361,000               monitors
Concurrent:·rt.sync.notifications           thrpt   10     680,000               counts
Concurrent:·rt.sync.parks                   thrpt   10   10028,000               counts
Concurrent:·threads.alive                   thrpt   10     189,900 ±     10,040  threads
Concurrent:·threads.daemon                  thrpt   10     174,400 ±      2,869  threads
Concurrent:·threads.started                 thrpt   10    4575,000               threads
Multi                                       thrpt   10       7,958 ±      0,727  ops/ms
Multi:·gc.alloc.rate                        thrpt   10     944,555 ±    262,203  MB/sec
Multi:·gc.alloc.rate.norm                   thrpt   10  200015,477 ±   7233,297  B/op
Multi:·gc.churn.PS_Eden_Space               thrpt   10    1059,710 ±    269,304  MB/sec
Multi:·gc.churn.PS_Eden_Space.norm          thrpt   10  226437,341 ±  27154,083  B/op
Multi:·gc.churn.PS_Survivor_Space           thrpt   10       2,598 ±      0,675  MB/sec
Multi:·gc.churn.PS_Survivor_Space.norm      thrpt   10     552,092 ±     48,209  B/op
Multi:·gc.count                             thrpt   10      54,000               counts
Multi:·gc.time                              thrpt   10     235,000               ms
Multi:·rt.safepointSyncTime                 thrpt   10       0,091               ms
Multi:·rt.safepointTime                     thrpt   10       0,698               ms
Multi:·rt.safepoints                        thrpt   10     687,000               counts
Multi:·rt.sync.contendedLockAttempts        thrpt   10    8145,000               locks
Multi:·rt.sync.fatMonitors                  thrpt   10    2816,000               monitors
Multi:·rt.sync.futileWakeups                thrpt   10     140,000               counts
Multi:·rt.sync.monitorDeflations            thrpt   10    2600,000               monitors
Multi:·rt.sync.monitorInflations            thrpt   10    2604,000               monitors
Multi:·rt.sync.notifications                thrpt   10     743,000               counts
Multi:·rt.sync.parks                        thrpt   10    7567,000               counts
Multi:·threads.alive                        thrpt   10     120,800 ±      9,998  threads
Multi:·threads.daemon                       thrpt   10     105,300 ±      2,855  threads
Multi:·threads.started                      thrpt   10    4507,000               threads
Single                                      thrpt   10       1,159 ±      0,115  ops/ms
Single:·gc.alloc.rate                       thrpt   10     181,800 ±     55,871  MB/sec
Single:·gc.alloc.rate.norm                  thrpt   10  262667,646 ±  13349,288  B/op
Single:·gc.churn.PS_Eden_Space              thrpt   10     218,519 ±    138,057  MB/sec
Single:·gc.churn.PS_Eden_Space.norm         thrpt   10  314999,297 ± 160492,547  B/op
Single:·gc.churn.PS_Survivor_Space          thrpt   10       1,155 ±      3,894  MB/sec
Single:·gc.churn.PS_Survivor_Space.norm     thrpt   10    1557,015 ±   5198,605  B/op
Single:·gc.count                            thrpt   10      12,000               counts
Single:·gc.time                             thrpt   10     106,000               ms
Single:·rt.safepointSyncTime                thrpt   10       0,086               ms
Single:·rt.safepointTime                    thrpt   10       0,422               ms
Single:·rt.safepoints                       thrpt   10     594,000               counts
Single:·rt.sync.contendedLockAttempts       thrpt   10   10148,000               locks
Single:·rt.sync.fatMonitors                 thrpt   10    2560,000               monitors
Single:·rt.sync.futileWakeups               thrpt   10     106,000               counts
Single:·rt.sync.monitorDeflations           thrpt   10    2221,000               monitors
Single:·rt.sync.monitorInflations           thrpt   10    2224,000               monitors
Single:·rt.sync.notifications               thrpt   10     684,000               counts
Single:·rt.sync.parks                       thrpt   10    8969,000               counts
Single:·threads.alive                       thrpt   10     118,000 ±      9,798  threads
Single:·threads.daemon                      thrpt   10     102,500 ±      3,207  threads
Single:·threads.started                     thrpt   10    4505,000               threads
{code}
> Allow preparing multiple prepared statements at once
> ----------------------------------------------------
>
> Key: CASSANDRA-9230
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9230
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Vishy Kasar
> Priority: Minor
> Labels: ponies
>
> We have a few cases like this:
> 1. A large number (40K) of clients
> 2. Each client prepares the same 10 prepared statements at startup and on
> reconnection to a node
> 3. A small(ish) number (24) of Cassandra nodes
> Each statement needs to be prepared on a Cassandra node just once, but
> currently it is prepared 40K times at startup.
> https://issues.apache.org/jira/browse/CASSANDRA-8831 will make the situation
> much better. A further optimization is to allow clients to prepare the
> not-yet-prepared statements in bulk. This way, a client can prepare all of
> them with one round trip to the server.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)