[
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262487#comment-14262487
]
Ariel Weisberg commented on CASSANDRA-7438:
-------------------------------------------
bq. Whether to migrate whole OHC code into org.apache.cassandra codebase (with
the option to either turn it on or off).
I am open to either. I asked Benedict and he prefers having it inside C* so we
can patch it. The advantage of having it outside is that it might see use
elsewhere and get additional eyes/contributions. You could start with it
outside, publish it to Maven Central, and if there is an issue getting patches
applied quickly we can always fork it in C*.
bq. Whether to implement a “pluggable row cache“ (to allow multiple
implementations)
I think that we aren't going to need multiple cache implementations in the long
run. Seems like we should be able to have one that can be configured to have the
desired behavior. Benedict doesn't feel strongly about it either. If Vijay
wants to continue working on another implementation then we would want to keep
it pluggable the way it currently is.
It looks like the KeyCache and CounterCache both use a different implementation
and not SerializingCache. I am not clear on why they don't use
SerializingCache. It's worth evaluating why that is before converging on a single
implementation.
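To make the "pluggable" option concrete, here is a minimal sketch of one cache contract with two interchangeable implementations selected by a configuration string. Every name in it (SimpleCache, UnboundedCache, LruCache, CacheFactory, the "lru"/"unbounded" strings) is hypothetical and chosen for illustration; it is not Cassandra's ICache or the current plug-in mechanism.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical minimal cache contract; NOT Cassandra's ICache.
interface SimpleCache<K, V> {
    void put(K key, V value);
    V get(K key);
    int size();
}

// Unbounded on-heap implementation backed by a concurrent map.
final class UnboundedCache<K, V> implements SimpleCache<K, V> {
    private final Map<K, V> map = new ConcurrentHashMap<>();
    public void put(K key, V value) { map.put(key, value); }
    public V get(K key) { return map.get(key); }
    public int size() { return map.size(); }
}

// Bounded LRU implementation: LinkedHashMap in access order, evicting the eldest entry.
final class LruCache<K, V> implements SimpleCache<K, V> {
    private final Map<K, V> map;

    LruCache(final int capacity) {
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public synchronized void put(K key, V value) { map.put(key, value); }
    public synchronized V get(K key) { return map.get(key); }
    public synchronized int size() { return map.size(); }
}

// Picks an implementation from an assumed configuration string.
final class CacheFactory {
    static <K, V> SimpleCache<K, V> create(String kind, int capacity) {
        switch (kind) {
            case "lru": return new LruCache<>(capacity);
            case "unbounded": return new UnboundedCache<>();
            default: throw new IllegalArgumentException("unknown cache kind: " + kind);
        }
    }
}
```

Callers only see SimpleCache, so converging on a single configurable implementation later would mean deleting a branch in the factory rather than touching call sites.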
bq. New per-table knob to enable whether to populate entries to the row cache
on reads+writes or just on reads (to target different workloads)
Sounds like it would be useful, but first we need either a user who says they
want this or a workload where it is the right call.
There may also be correctness issues to think about; see the next item.
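As a rough illustration of what such a knob would toggle, the sketch below models a table whose write path either populates the row cache or only invalidates it. PopulateOn and ToyTable are hypothetical names, the maps stand in for real storage, and the model is single-threaded, so it deliberately says nothing about the concurrency and correctness questions raised above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-table setting; the name and values are assumptions.
enum PopulateOn { READS, READS_AND_WRITES }

final class ToyTable {
    private final Map<String, String> disk = new HashMap<>();     // stands in for on-disk data
    private final Map<String, String> rowCache = new HashMap<>(); // stands in for the row cache
    private final PopulateOn populateOn;
    int diskReads = 0;                                            // counts cache misses

    ToyTable(PopulateOn populateOn) { this.populateOn = populateOn; }

    void write(String key, String row) {
        disk.put(key, row);
        if (populateOn == PopulateOn.READS_AND_WRITES) {
            rowCache.put(key, row);   // populate on write: the next read is a hit
        } else {
            rowCache.remove(key);     // invalidate only: the next read reloads from disk
        }
    }

    String read(String key) {
        String cached = rowCache.get(key);
        if (cached != null) return cached;
        diskReads++;
        String row = disk.get(key);
        if (row != null) rowCache.put(key, row); // populate on read in both modes
        return row;
    }
}
```

In this toy model a write-then-read workload costs one disk read per key with READS and none with READS_AND_WRITES, while a write-heavy workload that rarely reads back would fill the cache with rows nobody asks for.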
bq. Rethink about whether to keep the current RowCacheSentinel implementation
as is - if I understand it correctly, it just reduces the number of cache-put
operations (cache hit on a sentinel performs a disk read). A compromise
regarding additional serialization cost?
I think it is for correctness?
https://issues.apache.org/jira/browse/CASSANDRA-3862
I'm still reading up on this.
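For reference while reading up, here is a minimal sketch of the general sentinel idea as I currently understand it: a reader installs a placeholder before going to disk and only caches its result if that exact placeholder is still present afterwards, so a write that invalidates the key mid-read keeps stale data out of the cache. SentinelCache and its methods are hypothetical names; this is not the RowCacheSentinel code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class SentinelCache {
    // Unique placeholder meaning "a read of this key is in progress".
    private static final class Sentinel {}

    private final ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();

    // Writers drop whatever is cached for the key, including an in-flight sentinel.
    void invalidate(String key) { map.remove(key); }

    // Returns the cached row, or null if nothing (or only a sentinel) is cached.
    String getCached(String key) {
        Object v = map.get(key);
        return (v instanceof String) ? (String) v : null;
    }

    // Read-through: the loaded row is cached only if our sentinel survived the load.
    String read(String key, Function<String, String> loadFromDisk) {
        Object v = map.get(key);
        if (v instanceof String) return (String) v;     // real cache hit
        if (v != null) return loadFromDisk.apply(key);  // hit on another reader's sentinel: disk read, no caching

        Sentinel mine = new Sentinel();
        boolean installed = map.putIfAbsent(key, mine) == null;
        String row = loadFromDisk.apply(key);
        if (installed && row != null) {
            map.replace(key, mine, row);                // no-op if a writer removed our sentinel meanwhile
        } else if (installed) {
            map.remove(key, mine);                      // nothing to cache; clean up our sentinel
        }
        return row;
    }
}
```

The reader that raced with the write still returns what it loaded, but the value is not cached, so the next read goes back to disk and sees the new data.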
bq. Improvement of key (de)serialization (saving the row cache to disk) - use
direct I/O
There is some trickiness here because the AutoSavingCache breaks apart the keys
to determine where the data goes.
bq. Optimizations of value deserialization effort - let C* directly access a
cached row in off-heap memory instead of the deserialization (and on-heap
object construction) overhead.
I think these two together would make a good follow-up ticket. Another good
follow-up ticket would be addressing the allocator for performance and for
fragmentation.
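To illustrate what directly accessing a cached row in off-heap memory could look like, the sketch below stores a row in a direct ByteBuffer and reads a single field in place, next to a full deserialization for comparison. The layout ([long timestamp][int length][value bytes]) and the OffHeapRow class are assumptions made up for this example; they do not describe OHC's or Cassandra's serialized row format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class OffHeapRow {
    // Assumed toy layout: [long timestamp][int valueLength][value bytes]
    private static final int TIMESTAMP_OFFSET = 0;
    private static final int LENGTH_OFFSET = Long.BYTES;
    private static final int VALUE_OFFSET = Long.BYTES + Integer.BYTES;

    // Serializes a row into a direct (off-heap) buffer.
    static ByteBuffer serialize(long timestamp, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(VALUE_OFFSET + bytes.length);
        buf.putLong(timestamp).putInt(bytes.length).put(bytes);
        buf.flip();
        return buf;
    }

    // Direct access: reads one field with an absolute get, without building any on-heap object.
    static long timestamp(ByteBuffer row) {
        return row.getLong(TIMESTAMP_OFFSET);
    }

    // Full deserialization for comparison: copies the value bytes onto the heap.
    static String value(ByteBuffer row) {
        byte[] bytes = new byte[row.getInt(LENGTH_OFFSET)];
        ByteBuffer view = row.duplicate(); // independent position, same underlying memory
        view.position(VALUE_OFFSET);
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

A read that only needs the timestamp (for example, to compare against a tombstone) would touch eight bytes of off-heap memory instead of materializing the whole row.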
> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: Linux
> Reporter: Vijay
> Assignee: Vijay
> Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is only partially off heap; keys are still stored
> in the JVM heap as ByteBuffers (BB):
> * There are higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better
> results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off
> heap and use JNI to interact with the cache. We might want to ensure that the
> new implementation matches the existing APIs (ICache), and the implementation
> needs to have safe memory access, low memory overhead and as few memcpys as
> possible.
> We might also want to make this cache configurable.