[
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075630#comment-14075630
]
Robert Stupp commented on CASSANDRA-7438:
-----------------------------------------
Surely not a complete list, but a start...
Java code:
* com.lruc.util.Utils.getUnsafe can be safely deleted
* com.lruc.util.Utils.extractLibraryFile
** should check return code of {{File.createNewFile}}
** A call to {{File.delete()}} for the extracted library file should be added to
{{com.lruc.util.Utils.loadNative}}, since an unclean shutdown (kill -9) leaves
the so/dylib file behind. Possible on Unix systems, but not on Windows.
* Classes com.lruc.jni.lruc, SWIGTYPE_p_item and SWIGTYPE_p_p_item are unused
(refactoring leftover?)
* Generally the lruc code could be more integrated in C* code.
** Let the lruc classes implement org.apache.cassandra.io.util.DataOutputPlus
and java.io.DataInput so that they can be directly used by C*
ColumnFamilySerializer (no temporary Input/OutputStreams necessary).
** Maybe {{DataOutputPlus.write(Memory)}} can be removed in C* when lruc is
used - not sure about that.
** Implement most DataInput/Output methods in EntryInput/OutputStream to
benefit from Unsafe (e.g. Unsafe.getLong/putLong). I have seen that you
removed Abstract... some weeks ago ;)
** Using Unsafe for DataInput/Output of short/int/long/float/double has the
drawback that Unsafe always uses the system's byte order, which is not
necessarily the portable (big-endian) Java byte order. Of course there is no
drawback if all reads/writes are "paired", i.e. both sides go through Unsafe.
** {{Unsafe.copyMemory}} could be used for {{write(byte[])}}/{{read(byte[])}}.
* Naming of max_size, capacity - should use one common term which also makes
sure that it's a maximum memory size - e.g. max_size_bytes. _Capacity_ is often
used for the number of elements in a collection.
* Memory leak: {{com.lruc.api.LRUCache.hotN}} may keep references in native
code (no {{lruc_deref}} calls), if not all items are retrieved from the
iterator - remove _hotN_ or return an array/list instead?
* Generally I think all classes can be merged into a single package if only a
few are left (see above)
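To illustrate the byte order point above, here is a rough sketch. It uses plain {{ByteBuffer}} in native order as a stand-in for {{Unsafe.putLong}}/{{getLong}}, and {{ByteOrderDemo}} is a made-up class name, not something from the patch. The paired native write/read round-trips on any platform, while a big-endian read (which is what {{java.io.DataInput.readLong}} does) only agrees on big-endian hardware.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; not part of lruc or Cassandra.
final class ByteOrderDemo {

    // Writes a long in the platform's native byte order (assumption: this
    // mirrors what Unsafe.putLong would leave in memory).
    static byte[] writeNative(long value) {
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.nativeOrder())
                .putLong(value)
                .array();
    }

    // Paired read: same native order as the write.
    static long readNative(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder()).getLong();
    }

    // Unpaired read: big-endian, as java.io.DataInput.readLong defines it.
    static long readBigEndian(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        long value = 0x0102030405060708L;
        byte[] raw = writeNative(value);
        System.out.println("native order:            " + ByteOrder.nativeOrder());
        System.out.println("paired read matches:     " + (readNative(raw) == value));
        System.out.println("big-endian read matches: " + (readBigEndian(raw) == value));
    }
}
```

On a little-endian machine the last line should report a mismatch, which is exactly the case where data written via Unsafe must not be read by a regular DataInput.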
C code:
* {{#define item_lock(hv) while (item_trylock(hv)) continue;}} is a busy-wait
loop. Shouldn't there be something like a _yield_ in the loop body?
* Seems like the C code was not cleaned up after you began using
Unsafe.allocateMemory :)
* I did not follow all possible code paths (due to the previous point)
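What I mean by the _yield_ in {{item_lock}}, sketched in Java rather than C to keep it short. {{YieldingSpinLock}} and its methods are made-up names; in the C code the equivalent would presumably be a {{sched_yield()}} call inside the loop. Note that the C {{item_trylock}} appears to return non-zero on failure, while {{tryLock}} below returns true on success.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical Java analogue of the C item_lock macro; illustrative only.
final class YieldingSpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    // Returns true if the lock was acquired by this call.
    boolean tryLock() {
        return locked.compareAndSet(false, true);
    }

    // Spins until the lock is acquired, but gives up the time slice on
    // every failed attempt instead of burning CPU.
    void lock() {
        while (!tryLock()) {
            Thread.yield();
        }
    }

    void unlock() {
        locked.set(false);
    }

    public static void main(String[] args) throws InterruptedException {
        YieldingSpinLock lock = new YieldingSpinLock();
        int[] counter = {0};
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                lock.lock();
                try {
                    counter[0]++; // protected by the spin lock
                } finally {
                    lock.unlock();
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("counter = " + counter[0]);
    }
}
```

Without the yield, a waiter can spin through its whole time slice while the lock holder is descheduled, which hurts most when there are more threads than cores.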
Common:
* {{prefix_delimiter}} seems to be unused
Altogether I like that :)
> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: Linux
> Reporter: Vijay
> Assignee: Vijay
> Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch
>
>
> Currently SerializingCache is only partially off heap; keys are still stored in
> the JVM heap as BB.
> * GC costs are higher for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better
> results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off
> heap and use JNI to interact with the cache. We might want to ensure that the
> new implementation matches the existing APIs (ICache), and the implementation
> needs to have safe memory access, low memory overhead and as few memcpys as
> possible.
> We might also want to make this cache configurable.