[
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224904#comment-14224904
]
Vijay commented on CASSANDRA-7438:
----------------------------------
{quote}
sun.misc.Hashing doesn't seem to exist for me, maybe a Java 8 issue?
StatsHolder, same AtomicLongArray suggestion. Also consider LongAdder.
{quote}
Yep, let me find alternatives for Java 8 (and, until we move to Java 8, a substitute for LongAdder).
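For the StatsHolder suggestion, a minimal sketch of what a LongAdder-based counter holder could look like (class and method names here are hypothetical, not the patch's actual API); LongAdder (Java 8+) spreads contended increments across cells, so many writer threads don't serialize on one cache line the way they do with a single AtomicLong:

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stats holder: LongAdder trades a slightly more
// expensive read (sum()) for much cheaper concurrent increments.
public class StatsSketch {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    public void recordHit()  { hits.increment(); }
    public void recordMiss() { misses.increment(); }

    public long hitCount()  { return hits.sum(); }
    public long missCount() { return misses.sum(); }
}
```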
{quote}
The queue really needs to be bounded, producer and consumer could proceed at
different rates.
In Segment.java in the replace path AtomicLong.addAndGet is called back to
back, could be called once with the math already done. I believe each of those
stalls processing until the store buffers have flushed. The put path does
something similar and could have the same optimization.
{quote}
Yeah, those were an oversight.
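The point in the quote above: each `AtomicLong.addAndGet` is a full read-modify-write that stalls until the store buffers drain, so two back-to-back calls pay that cost twice. A sketch of the fix, with hypothetical names standing in for the Segment accounting:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical memory accounting for a cache segment.
public class MemoryAccounting {
    private final AtomicLong bytesUsed = new AtomicLong();

    // Before: two atomic updates on the replace path, two stalls.
    public long replaceSlow(long oldSize, long newSize) {
        bytesUsed.addAndGet(-oldSize);
        return bytesUsed.addAndGet(newSize);
    }

    // After: do the math up front, pay for one atomic update.
    public long replaceFast(long oldSize, long newSize) {
        return bytesUsed.addAndGet(newSize - oldSize);
    }
}
```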
{quote}
Tasks submitted to executor services via submit will wrap the result including
exceptions in a future which silently discards them.
The library might take at initialization time a listener for these errors, or
if it is going to be C* specific it could use the wrapped runnable or similar.
{quote}
Are you suggesting configurable logging/exception handling in case the two threads throw exceptions? If so, sure. Other exceptions are, AFAIK, already propagated (still needs cleanup, though).
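To make the "wrapped runnable" idea concrete: `ExecutorService.submit` returns a Future, and if nobody calls `get()`, the exception is silently dropped. A sketch of a wrapper that routes the exception to a listener supplied at initialization time (the names here are illustrative, not the patch's API):

```java
// Hypothetical wrapper: reports exceptions that would otherwise
// vanish inside an uninspected Future to a configurable listener.
public class SafeRunnable {
    public static Runnable wrap(Runnable task,
                                java.util.function.Consumer<Throwable> listener) {
        return () -> {
            try {
                task.run();
            } catch (Throwable t) {
                listener.accept(t); // surfaced instead of swallowed
            }
        };
    }
}
```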
{quote}
A lot of locking that was spin locking (which unbounded I don't think is great)
is now blocking locking. There is no adaptive spinning if you don't use
synchronized. If you are already using unsafe maybe you could do monitor
enter/exit. Never tried it.
Having the table (segments) on heap is pretty undesirable to me. Happy to be
proved wrong, but I think a flyweight over off heap would be better.
{quote}
Segments are small in memory so far in my tests. The spin lock is there so that, once acquired, the locker can check whether the segment was rehashed in the meantime; this is better than having a separate, central lock (no different from Java or memcached).
I'm not sure I understand the Unsafe lock suggestion; an example would help.
The segments are on heap mainly to handle the locking. I think we can do a bit of CAS, but a global lock on rehashing will be a problem (maybe an alternate approach is required).
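One middle ground between unbounded spinning and immediate blocking (not the patch's implementation, just a sketch of the trade-off being debated): spin a bounded number of times for the common short-hold case, then park on a blocking lock instead of burning CPU:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical bounded-spin lock: approximates the adaptive
// spinning that synchronized gives for free.
public class SpinThenBlockLock {
    private static final int SPIN_LIMIT = 100;
    private final ReentrantLock lock = new ReentrantLock();

    public void lock() {
        for (int i = 0; i < SPIN_LIMIT; i++) {
            if (lock.tryLock())
                return;          // won while spinning
        }
        lock.lock();             // give up and park the thread
    }

    public void unlock() {
        lock.unlock();
    }

    public boolean isHeld() {
        return lock.isHeldByCurrentThread();
    }
}
```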
{quote}
It looks like concurrent calls to rehash could cause the table to rehash twice
since the rebalance field is not CASed. You should do the volatile read, and
then attempt the CAS (avoids putting the cache line in exclusive state every
time).
{quote}
Nope, it is a single-threaded executor and the rehash boolean is already volatile :)
The next commit will use conditions instead (similar to the C implementation).
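For reference, the "volatile read, then CAS" pattern the reviewer describes looks like this (a generic sketch, not the patch's code): losing threads only do a shared read and never pull the flag's cache line into exclusive state, while exactly one thread wins the CAS:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical gate ensuring at most one concurrent rehash.
public class RehashGate {
    private final AtomicBoolean rehashing = new AtomicBoolean(false);

    /** Returns true only for the single thread that should rehash. */
    public boolean tryStartRehash() {
        if (rehashing.get())                          // cheap shared read
            return false;
        return rehashing.compareAndSet(false, true);  // one winner
    }

    public void finishRehash() {
        rehashing.set(false);
    }
}
```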
{quote}
If the expiration lock is already locked some other thread is doing the
expiration work. You might keep a semaphore for puts that bypass the lock so
other threads can move on during expiration. I suppose after the first few
evictions new puts will move on anyways. This would show up in a profiler if it
were happening.
{quote}
Good point… or a tryLock to spin and check whether some other thread has released enough memory.
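A sketch of that tryLock idea (the accounting hooks `freeBytes` and `evictSome` are hypothetical stand-ins): a writer needing space attempts the expiration lock; if another thread already holds it, the writer loops and re-checks free memory instead of queueing behind the evictor:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical eviction coordinator built around tryLock.
public class Evictor {
    private final ReentrantLock expiryLock = new ReentrantLock();

    public boolean ensureCapacity(long needed,
                                  java.util.function.LongSupplier freeBytes,
                                  Runnable evictSome) {
        while (freeBytes.getAsLong() < needed) {
            if (expiryLock.tryLock()) {
                try {
                    evictSome.run();   // we are the evictor
                } finally {
                    expiryLock.unlock();
                }
            }
            // else: another thread is evicting; re-check and move on
        }
        return true;
    }
}
```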
{quote}
hotN looks like it could lock for quite a while (hundreds of milliseconds,
seconds) depending on the size of N. You don't need to use a linked list for
the result just allocate an array list of size N. Maybe hotN should be able to
yield, possibly leaving behind an iterator that evictors will have to repair.
Maybe also depends on how top N handles duplicate or multiple versions of keys.
Alternatively hotN could take a read lock, and writers could skip the cache?
{quote}
We cannot have duplicates in the queue (remember, it is a doubly linked list of the items in the cache). A read lock on q_expiry_lock is all we need; let me fix it.
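Combining both points above (read lock plus the reviewer's pre-sized ArrayList instead of a linked-list result), hotN could take roughly this shape; the key type and queue iteration are placeholders, not the patch's actual structures:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical hotN: copy at most n keys in LRU order under a
// read lock, into a list allocated at its final size.
public class HotN<K> {
    private final ReadWriteLock qExpiryLock = new ReentrantReadWriteLock();

    public List<K> hotN(Iterable<K> lruOrder, int n) {
        qExpiryLock.readLock().lock();
        try {
            List<K> result = new ArrayList<>(n);   // sized up front
            Iterator<K> it = lruOrder.iterator();
            while (result.size() < n && it.hasNext())
                result.add(it.next());
            return result;
        } finally {
            qExpiryLock.readLock().unlock();
        }
    }
}
```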
> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: Linux
> Reporter: Vijay
> Assignee: Vijay
> Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch
>
>
> Currently SerializingCache is partially off heap; keys are still stored in
> the JVM heap as ByteBuffers.
> * There is a higher GC cost for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better
> results, but this requires careful tuning.
> * The memory overhead for cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off
> heap and use JNI to interact with the cache. We might want to ensure that the
> new implementation matches the existing API (ICache), and the implementation
> needs safe memory access, low memory overhead, and as few memcpys as
> possible.
> We might also want to make this cache configurable.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)