[
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239729#comment-14239729
]
Ariel Weisberg commented on CASSANDRA-7438:
-------------------------------------------
Lot's of cool stuff here Robert.
Unit test wise there is a lot of code that is only covered indirectly or not at
all and the behaviors are not checked for explicitly. I don't think it makes
sense to include code that doesn't doesn't have a unit test claiming it does
what is says it does. The various input/output streams, buffering, and
compression all need units tests. Uns.java needs a unit test for pretty much
every method as well as the validation functionality. HashEntry has a bunch of
uncovered functions. For me the lack of test coverage is the biggest barrier.
What I am reacting to is that the tests are black box and miss things.
OffHeapMap containsEntry has no tests. removeEntry has untested code.
removeLink still has untested code. There is untested histogram stuff,
deserializeEntry, serializeEntry. HashEntry classes have untested functions.
HashEntries has many predicates that are untested.
Having a unit test that fuzzes against a parallel implementation at the same
time using a different LRU map implementation would be great for a black box
test. You can stripe the other implementation the same way so that the eviction
matches.
One of my previous comments was that SegmentCacheImpl duplicates reference
counting code from OffHeapMap and should just delegate. It ends up doing that
anyways.
I would really like to see the cleanup/eviction code go away. If inserting an
entry would blow capacity remove entries until it doesn't. I don't see a reason
to monkey with thresholds.
At some point the existing C* cache interface needs to gel with your work.
Right now C* uses the hotN and getKeys interface to return the contents of the
cache for persistence. I think the path of least resistance to start would be
to implement the existing interface and then come back and look at how to get
compression and more efficient IO into all the implementations. The existing
stuff in C* doesn't do compression and doesn't buffer its IO. I would prefer to
minimize major changes to the existing C* code. I want to get it working and
then iterate further for other improvements like more efficient cache
serialization.
You could change the OHC interface or implement an adapter. I think it's fine
to modify ICache to return iterables or iterators instead of a collections to
incrementally produce key set and hot keys. For everything else I would really
like to see things to stay the same unless there is something to be gained by
changing the interface.
> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: Linux
> Reporter: Vijay
> Assignee: Vijay
> Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap, keys are still stored in
> JVM heap as BB,
> * There is a higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better
> results, but this requires careful tunning.
> * Overhead in Memory for the cache entries are relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off
> heap and use JNI to interact with cache. We might want to ensure that the new
> implementation match the existing API's (ICache), and the implementation
> needs to have safe memory access, low overhead in memory and less memcpy's
> (As much as possible).
> We might also want to make this cache configurable.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)