[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262487#comment-14262487
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
-------------------------------------------

bq. Whether to migrate whole OHC code into org.apache.cassandra codebase (with 
the option to either turn it on or off).
I am open to either. I asked Benedict and he prefers having it inside C* so we 
can patch it. The advantage of having it outside is that it might see use 
elsewhere and get additional eyes/contributions. You could start with it 
outside and publish to Maven Central, and if there is an issue getting patches 
applied quickly we can always fork it in C*.

bq. Whether to implement a “pluggable row cache“ (to allow multiple 
implementations)
I think that we aren't going to need multiple cache implementations in the long 
run. It seems like we should be able to have one that can be configured to have the 
desired behavior. Benedict doesn't feel strongly about it either. If Vijay 
wants to continue working on another implementation then we would want to keep 
it pluggable the way it currently is.

It looks like the KeyCache and CounterCache both use a different implementation 
rather than SerializingCache, and I am not clear on why. It's worth evaluating 
that before converging on a single implementation.

bq. New per-table knob to enable whether to populate entries to the row cache 
on reads+writes or just on reads (to target different workloads)
It sounds useful, but first we need either a user who asks for it or a workload 
where it is the right call. There may also be correctness issues to think 
about; see the next item.
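
To make the proposed knob concrete, here is a minimal sketch of the two 
behaviours. Everything in it is hypothetical (the class, the enum and its 
values, the in-memory "disk" map); it is not the C* row cache API, and it 
ignores concurrency entirely. In READS_ONLY mode a write just invalidates the 
entry; in READS_AND_WRITES mode a write also populates it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class PopulatePolicySketch {
    enum PopulateOn { READS_ONLY, READS_AND_WRITES }

    private final PopulateOn mode;
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> disk = new HashMap<>(); // stand-in for sstables

    PopulatePolicySketch(PopulateOn mode) {
        this.mode = mode;
    }

    void write(String key, String row) {
        disk.put(key, row);
        if (mode == PopulateOn.READS_AND_WRITES) {
            cache.put(key, row);   // write-heavy tables pay for a cache put on every write
        } else {
            cache.remove(key);     // only invalidate; the next read repopulates
        }
    }

    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String row = disk.get(key);
        if (row != null) {
            cache.put(key, row);   // both modes populate on a read miss
        }
        return row;
    }

    boolean isCached(String key) {
        return cache.containsKey(key);
    }
}
```

Which mode is better depends on whether rows that were just written tend to 
be read again soon, which is the workload question above.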

bq. Rethink about whether to keep the current RowCacheSentinel implementation 
as is - if I understand it correctly, it just reduces the number of cache-put 
operations (cache hit on a sentinel performs a disk read). A compromise 
regarding additional serialization cost?
I think it is for correctness? 
https://issues.apache.org/jira/browse/CASSANDRA-3862
I'm still reading up on this.
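
As a way of thinking about the correctness angle, here is a rough sketch of 
the general sentinel pattern as I understand it. This is not the actual 
RowCacheSentinel code; the class and method names are made up, rows are plain 
Strings, and the loader is assumed never to return null. A read claims the 
slot with a unique sentinel, loads from disk, and only publishes the row if 
its own sentinel is still there. A write that invalidates the key in the 
meantime makes the publish fail, so a stale row is never cached.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration of the sentinel idea; not Cassandra's implementation.
final class SentinelCacheSketch {
    // One instance per in-flight read; equals() is identity, which replace() relies on.
    private static final class Sentinel {}

    private final ConcurrentHashMap<String, Object> cache = new ConcurrentHashMap<>();

    String read(String key, Function<String, String> loadFromDisk) {
        Object cached = cache.get(key);
        if (cached instanceof String) {
            return (String) cached;            // real hit
        }
        if (cached instanceof Sentinel) {
            return loadFromDisk.apply(key);    // hit on a sentinel: read disk, leave cache alone
        }

        Sentinel mine = new Sentinel();
        Object previous = cache.putIfAbsent(key, mine);
        if (previous != null) {
            // Lost the race to another reader.
            return previous instanceof String ? (String) previous : loadFromDisk.apply(key);
        }

        String row = loadFromDisk.apply(key);
        // Publishes only if our sentinel survived; fails if a write invalidated the key.
        cache.replace(key, mine, row);
        return row;
    }

    // Write path: drop whatever is cached, including any in-flight sentinel.
    void invalidate(String key) {
        cache.remove(key);
    }

    boolean isCached(String key) {
        return cache.get(key) instanceof String;
    }
}
```

If that reading is right, the sentinel is not just saving cache-put 
operations, so removing it would need a replacement for this check.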

bq. Improvement of key (de)serialization (saving the row cache to disk) - use 
direct I/O
There is some trickiness here because the AutoSavingCache breaks apart the keys 
to determine where the data goes.

bq. Optimizations of value deserialization effort - let C* directly access a 
cached row in off-heap memory instead of the deserialization (and on-heap 
object construction) overhead.
I think these two together would make a good follow-up ticket. Another good 
follow-up ticket would be addressing the allocator for performance and for 
fragmentation.
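
For the direct-access idea, a small sketch of what "read the cached row in 
place" could look like. The row layout here (timestamp, value length, value 
bytes) is invented for the example and has nothing to do with the real C* 
serialization format; a direct ByteBuffer stands in for the off-heap cache 
entry. The point is that a single field can be read with an absolute get, 
without deserializing the whole row into on-heap objects.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical layout: [long timestamp][int valueLength][value bytes].
final class OffHeapRowSketch {
    static final int TIMESTAMP_OFFSET = 0;
    static final int LENGTH_OFFSET = 8;
    static final int VALUE_OFFSET = 12;

    // Serialize a "row" into off-heap (direct) memory.
    static ByteBuffer serialize(long timestamp, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(VALUE_OFFSET + bytes.length);
        buf.putLong(timestamp).putInt(bytes.length).put(bytes);
        buf.flip();
        return buf;
    }

    // Direct access: absolute read, no row object built, buffer position untouched.
    static long timestampOf(ByteBuffer row) {
        return row.getLong(TIMESTAMP_OFFSET);
    }

    // Full deserialization for comparison: copies the value onto the heap.
    static String valueOf(ByteBuffer row) {
        int length = row.getInt(LENGTH_OFFSET);
        byte[] bytes = new byte[length];
        ByteBuffer view = row.duplicate(); // independent position, same memory
        view.position(VALUE_OFFSET);
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

A check like "is this row newer than X" would only need timestampOf, which is 
where the saving over full deserialization would come from.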

> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
>                 Key: CASSANDRA-7438
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux
>            Reporter: Vijay
>            Assignee: Vijay
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is only partially off heap; keys are still stored 
> in the JVM heap as BB.
> * There are higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better 
> results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off 
> heap and use JNI to interact with the cache. We might want to ensure that the 
> new implementation matches the existing APIs (ICache), and the implementation 
> needs to have safe memory access, low memory overhead and as few memcpy's as 
> possible.
> We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
