[
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274409#comment-14274409
]
Ariel Weisberg commented on CASSANDRA-7438:
-------------------------------------------
I did another review. The additional test coverage looks great.
Don’t throw Error, throw runtime exceptions on things like serialization
issues. The only place it make sense to throw error is when allocating memory
fails. That would match the behavior of ByteBuffer.allocateDirect. I don’t see
failure to allocate from the heap allocator as recoverable even in the context
of the cache. IOError is thrown from one place in the entire JDK (Console) so
it's an odd choice.
freeCapacity should reall be a field inside each segment and full/not full and
eviction decisions should be made inside each segment independently. In
practice inside C* it’s probably fine as just an AtomicLong, but I want to see
OHC be all it can be.
Rehash test could validate the data. After the rehash. It could also validate
the rehash under concurrent access, say have a reader thread that is randomly
accessing values < the already inserted value.I can’t tell if the crosscheck
test inserts enough values to trigger rehashing.
Inlining the murmur3 changes makes me a little uncomfortable. It’s good see see
some test coverage comparing with another implementation, but it’s over a small
set of data. It seems like the Unsigned stuff necessary to perfectly mimic the
native version of murmur3 is missing?
Add 2-4 byte coed points for the UTF-8 tests.
FastUtil is a 17 megabyte dependency all to get one array list.
The cross checking implementation is really nice.
Looking at the AbstractKeyIterator, I don’t see how it can do the right thing
when a segment rehashes. It will point to a random spot in the segment after a
rehash right? In practice maybe this doesn’t matter since they should size up
promptly and it’s just an optimization that we dump this stuff at all. I can
understand what the current code does so I lean towards keeping it.
There are a couple of places (serializeForPut, putInternal, maybe others) where
there are two exception handlers that each de-allocate the same piece of
memory. The deallocation could go in a finally instead of the exception
handlers since it always happens.
> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: Linux
> Reporter: Vijay
> Assignee: Robert Stupp
> Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap, keys are still stored in
> JVM heap as BB,
> * There is a higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better
> results, but this requires careful tunning.
> * Overhead in Memory for the cache entries are relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off
> heap and use JNI to interact with cache. We might want to ensure that the new
> implementation match the existing API's (ICache), and the implementation
> needs to have safe memory access, low overhead in memory and less memcpy's
> (As much as possible).
> We might also want to make this cache configurable.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)