[
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265947#comment-14265947
]
Robert Stupp commented on CASSANDRA-7438:
-----------------------------------------
The latest (just checked in) benchmark implementation gives much better
results. Using
{{com.codahale.metrics.Timer#time(java.util.concurrent.Callable<T>)}}
eliminates use of {{System.nanoTime()}} or
{{ThreadMXBean.getCurrentThreadCpuTime()}} - it can directly use its internal
clock.
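A minimal sketch of that pattern, not the actual ohc-benchmark code: the benchmark hands each operation to the timer as a {{Callable}}, and the timer does the measuring. {{SimpleTimer}} is a hypothetical stdlib-only stand-in, so the example runs without the Metrics library. It reads {{System.nanoTime()}} internally purely for simplicity. The point is only that the benchmark loop itself contains no timing calls.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.LongAdder;

public class TimerSketch {
    /** Hypothetical stand-in for a metrics timer; NOT the Metrics library class. */
    static final class SimpleTimer {
        private final LongAdder count = new LongAdder();
        private final LongAdder totalNanos = new LongAdder();

        /** Runs the operation and records its duration; the caller never reads a clock. */
        <T> T time(Callable<T> op) throws Exception {
            long start = System.nanoTime();
            try {
                return op.call();
            } finally {
                // Recorded even if the operation throws.
                totalNanos.add(System.nanoTime() - start);
                count.increment();
            }
        }

        long count() { return count.sum(); }
        long totalNanos() { return totalNanos.sum(); }
    }

    public static void main(String[] args) throws Exception {
        SimpleTimer readTimer = new SimpleTimer();
        // The benchmark loop only submits work to the timer.
        for (int i = 0; i < 1000; i++) {
            final int key = i;
            readTimer.time(() -> Integer.toString(key)); // placeholder for a cache read
        }
        System.out.println("ops=" + readTimer.count()
                + " totalNanos=" + readTimer.totalNanos());
    }
}
```

With the real library the call site would look the same, with {{SimpleTimer}} replaced by {{com.codahale.metrics.Timer}}.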
The benchmark {{java -jar ohc-benchmark/target/ohc-benchmark-0.2-SNAPSHOT.jar
-rkd 'gaussian(1..20000000,2)' -wkd 'gaussian(1..20000000,2)' -vs
'gaussian(1024..4096,2)' -r .9 -cap 1600000000 -d 30 -t 30}} improved from 800k
reads to 3.3M reads per second (w/ 8 cores). So yes - the benchmark was measuring
its own mad code. Because of that, I edited my previous comment containing the
benchmark results, since those are invalid now.
I've added a (still simple) JMH benchmark as a separate module. This one can
cause high system CPU usage at operation rates of 2M per second or more (8
cores). I think these rates are really fine.
Note: these rates cannot be achieved in production since then you'll obviously
have to pay for (de)serialization, too.
So we want to address these topics as follow-up:
* own off-heap allocator
* C* ability to access off-heap cached rows
* C* ability to serialize hot keys directly from off-heap (might be a minor win
since it's triggered not that often)
* per-table knob to control whether to add to row-cache on writes -- I strongly
believe that this is a useful feature (maybe LHF) for workloads where reads and
writes touch different (row) keys.
* investigate if counter-cache can benefit
* investigate if key-cache can benefit
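To make the per-table knob from the list above concrete, here is a toy sketch under stated assumptions. {{TableCacheOptions}}, {{cacheOnWrite}} and {{Table}} are hypothetical names invented for illustration, not existing C* classes or settings, and the maps stand in for the real storage and row cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WriteCachingSketch {
    /** Hypothetical per-table option; the name is an assumption, not a C* setting. */
    static final class TableCacheOptions {
        final boolean cacheOnWrite;
        TableCacheOptions(boolean cacheOnWrite) { this.cacheOnWrite = cacheOnWrite; }
    }

    /** Toy table: a backing store plus a row cache, both keyed by row key. */
    static final class Table {
        final Map<String, String> store = new ConcurrentHashMap<>();
        final Map<String, String> rowCache = new ConcurrentHashMap<>();
        final TableCacheOptions options;

        Table(TableCacheOptions options) { this.options = options; }

        void write(String key, String row) {
            store.put(key, row);
            if (options.cacheOnWrite) {
                rowCache.put(key, row);   // written rows are expected to be read soon
            } else {
                rowCache.remove(key);     // only drop a stale entry; a later read repopulates
            }
        }

        String read(String key) {
            // A cache miss loads from the store; absent rows are not cached.
            return rowCache.computeIfAbsent(key, store::get);
        }
    }

    public static void main(String[] args) {
        Table t = new Table(new TableCacheOptions(false));
        t.write("row1", "v1");
        System.out.println("cached after write: " + t.rowCache.containsKey("row1"));
        t.read("row1");
        System.out.println("cached after read: " + t.rowCache.containsKey("row1"));
    }
}
```

With the knob off, rows that are written but never read do not take up cache space, which is the case the bullet describes.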
bq. You could start with it outside and publish to maven central and if there
is an issue getting patches applied quickly we can always fork it in C*.
OK
bq. pluggable row cache
Then I'll start with that - just make row-cache pluggable and the
implementation configurable.
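One way "pluggable and configurable" could look, as a rough sketch only: the cache sits behind a small interface and the implementation is chosen by a class name taken from configuration. {{RowCache}}, {{OnHeapRowCache}} and {{create}} are hypothetical names for illustration; this is not the existing {{ICache}} API and not the eventual patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PluggableCacheSketch {
    /** Hypothetical minimal cache contract; NOT the actual C* ICache interface. */
    interface RowCache<K, V> {
        V get(K key);
        void put(K key, V value);
        void remove(K key);
    }

    /** Simple on-heap implementation, used here as the configurable default. */
    public static final class OnHeapRowCache<K, V> implements RowCache<K, V> {
        private final Map<K, V> map = new ConcurrentHashMap<>();
        public OnHeapRowCache() {}
        @Override public V get(K key) { return map.get(key); }
        @Override public void put(K key, V value) { map.put(key, value); }
        @Override public void remove(K key) { map.remove(key); }
    }

    /** Instantiates the configured implementation through its no-arg constructor. */
    @SuppressWarnings("unchecked")
    static <K, V> RowCache<K, V> create(String className)
            throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        // asSubclass fails fast if the configured class is not a RowCache.
        return (RowCache<K, V>) cls.asSubclass(RowCache.class)
                .getDeclaredConstructor()
                .newInstance();
    }

    public static void main(String[] args) throws Exception {
        // In a real setup this name would come from a configuration file.
        String configured = OnHeapRowCache.class.getName();
        RowCache<String, String> cache = create(configured);
        cache.put("row1", "v1");
        System.out.println(cache.get("row1"));
    }
}
```

An off-heap implementation would then only need to implement the same interface and be named in the configuration.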
Note: JNA has a synchronized block that's executed at every call - version
4.2.0 fixes this (don't know when it will be released).
> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: Linux
> Reporter: Vijay
> Assignee: Vijay
> Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap, keys are still stored in
> JVM heap as BB,
> * There are higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better
> results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off
> heap and use JNI to interact with the cache. We might want to ensure that the new
> implementation matches the existing APIs (ICache), and the implementation
> needs to have safe memory access, low overhead in memory and less memcpy's
> (As much as possible).
> We might also want to make this cache configurable.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)