[
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983417#comment-13983417
]
Benedict commented on CASSANDRA-6694:
-------------------------------------
I've pushed a completed branch
[here|https://github.com/belliottsmith/cassandra/tree/6694-reorg2].
I've taken your flattening of the PoolAllocator and DataAllocator hierarchies
to completion, implemented DecoratedKey, reintroduced the extra unit tests,
fixed some bugs with the Cell hierarchy, slightly rejigged the data layout for
native cells to simplify offset calculation, and fixed a performance regression
and the message digest optimisation.
The only thing I haven't done is the refactoring I would like to perform before
we finally commit this; I've held off so as to make review easier for others.
Note I'm still running dtests and doing some final vetting, but I wanted to
post this message now as I reckon this version is most likely ready and this is
somewhat time critical, and because I want to avoid any duplicated effort in
getting a final patch together.
I think I've addressed your concerns [~iamaleksey], with the following
notes:
bq. getAllocator() doesn’t belong to SecondaryIndex, API-wise. CFS#logFlush()
and CFS.FLCF#run() should just use
SecondaryIndexManager#getIndexesNotBackedByCfs() and get their allocators
directly instead of using SIM#getIndexes() and checking for null.
This was a conscious decision to permit custom 2i to use our allocators and
count towards the bookkeeping for memory utilisation.
bq. Composite/CellName/CellNameType/etc#copy() all now have an extra CFMetaData
argument, while only NativeCell really uses it. Can we isolate its usage to a
NativeCell-specific methods and leave the rest alone?
I'm not sure how we would do that when either kind of cell can be present at
the point you want to perform these calls. It's possible I'm missing something
obvious though, so please do let me know :)
> Slightly More Off-Heap Memtables
> --------------------------------
>
> Key: CASSANDRA-6694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Assignee: Benedict
> Labels: performance
> Fix For: 2.1 beta2
>
>
> The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as
> the on-heap overhead is still very large. It should not be tremendously
> difficult to extend these changes so that we allocate entire Cells off-heap,
> instead of multiple BBs per Cell (with all their associated overhead).
> The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6
> bytes per cell on average for the btree overhead, for a total overhead of
> around 20-22 bytes). This translates to 8-byte object overhead, 4-byte
> address (we will do alignment tricks like the VM to allow us to address a
> reasonably large memory space, although this trick is unlikely to last us
> forever, at which point we will have to bite the bullet and accept a 24-byte
> per cell overhead), and 4-byte object reference for maintaining our internal
> list of allocations, which is unfortunately necessary since we cannot safely
> (and cheaply) walk the object graph we allocate otherwise, which is necessary
> for (allocation-) compaction and pointer rewriting.
> The ugliest thing here is going to be implementing the various CellName
> instances so that they may be backed by native memory OR heap memory.
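The overhead arithmetic in the description above can be checked with a small sketch. This is illustrative only and is not code from the branch: the class and method names are hypothetical, the byte sizes are the figures quoted in the description, and rounding to an 8-byte boundary is an assumption that happens to reproduce the quoted 16-byte and 24-byte totals. The address compression shows the general "alignment trick" (store an aligned address shifted right so it fits in 4 bytes), not the scheme the patch actually uses.

```java
// Hypothetical sketch: names and structure are assumptions, not Cassandra code.
public final class CellOverheadSketch {
    // Figures taken from the issue description.
    static final int OBJECT_HEADER_BYTES = 8;
    static final int ALLOCATION_LIST_REF_BYTES = 4;

    // Assumption: object sizes are rounded up to a multiple of 8 bytes.
    static int roundUpTo8(int bytes) {
        return (bytes + 7) & ~7;
    }

    // Per-cell on-heap overhead for a given width of the stored native address.
    static int perCellOverhead(int addressBytes) {
        return roundUpTo8(OBJECT_HEADER_BYTES + addressBytes + ALLOCATION_LIST_REF_BYTES);
    }

    // Store an aligned address in 32 bits by dropping the always-zero low bits.
    static int compress(long address, int alignmentShift) {
        if ((address & ((1L << alignmentShift) - 1)) != 0)
            throw new IllegalArgumentException("address is not aligned");
        long shifted = address >>> alignmentShift;
        if (shifted > 0xFFFFFFFFL)
            throw new IllegalArgumentException("address out of compressed range");
        return (int) shifted;
    }

    // Recover the full address; the mask treats the int as unsigned.
    static long decompress(int compressed, int alignmentShift) {
        return (compressed & 0xFFFFFFFFL) << alignmentShift;
    }

    // Total bytes reachable with a 32-bit compressed address.
    static long addressableBytes(int alignmentShift) {
        return (1L << 32) << alignmentShift;
    }

    public static void main(String[] args) {
        int compact = perCellOverhead(4);
        int wide = perCellOverhead(8);
        System.out.println("overhead with 4-byte address: " + compact);
        System.out.println("overhead with 8-byte address: " + wide);
        System.out.println("with btree overhead: " + (compact + 4) + "-" + (compact + 6));
        System.out.println("addressable with 8-byte alignment: " + addressableBytes(3));
    }
}
```

With an 8-byte alignment (a shift of 3), a 4-byte compressed address covers 32 GiB of native memory, which is why the description expects the trick to stop being sufficient at some point.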
--
This message was sent by Atlassian JIRA
(v6.2#6252)