[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962691#comment-13962691
 ] 

Benedict commented on CASSANDRA-6694:
-------------------------------------

bq. I think Jonathan Ellis mentioned that it might be better to reduce usage of 
the column names instead of merging cell with column name (if I remember 
correctly)

I don't recall this suggestion. Perhaps you are referring to the suggestion 
that we not extract the cell names from the cell as often as we do, for the 
purpose of comparison, in order to reduce garbage production?

bq. Regarding placeholders idea, if we allocate contiguous region for the whole 
cell we can just have memory object + 1 int (or was it even short?...) field 
which marks the end of the column name at that buffer, as column timestamp is a 
fixed size long we know exactly where column value ends, that also helps with 
spatial locality in most of the 

In this case, this suggestion has much more complex problems:

# More virtual method invocations with multiple implementations (as shown by 
CASSANDRA-6993, this can have meaningfully negative performance implications)
# Major refactor of AbstractType hierarchy to prevent bytebuffer allocation on 
comparison
# More object allocation in the request threads, due to having to re-pack all 
of the parameters into a Cell with a single buffer, as opposed to just dropping 
them in place
# At which point it would make most sense to refactor (and mostly eliminate) 
the entirety of CASSANDRA-5417, as we're almost always pumping the result 
straight into a Cell anyway, so extracting the components into separate buffers 
and repacking them into a single buffer in the Cell is very wasteful

That said, it is *viable*. It has some advantages too: the comparisons between 
Native and Buffer cells are much more easily optimised. Many of these changes 
may well need to happen in the natural course of things anyway as we optimise 
the native implementation. But it has comparatively wide-ranging implications 
for the current on-heap use case that might be a bit too much to bite off right 
now.
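
For reference, a minimal sketch of the single-buffer layout being discussed: 
one contiguous region per cell, plus one int marking the end of the name. 
PackedCell and its methods are hypothetical names for illustration only, not 
existing Cassandra classes, and the name | timestamp | value ordering is an 
assumption based on the quoted description.

```java
import java.nio.ByteBuffer;

/** Hypothetical single-buffer cell; illustration only, not a Cassandra class. */
final class PackedCell {
    // Assumed layout: [name bytes][8-byte timestamp][value bytes]
    private final ByteBuffer buffer;
    // Offset at which the name ends and the fixed-size timestamp begins.
    private final int nameEnd;

    private PackedCell(ByteBuffer buffer, int nameEnd) {
        this.buffer = buffer;
        this.nameEnd = nameEnd;
    }

    /** Re-packs the three components into one contiguous buffer. */
    static PackedCell pack(byte[] name, long timestamp, byte[] value) {
        ByteBuffer b = ByteBuffer.allocate(name.length + Long.BYTES + value.length);
        b.put(name).putLong(timestamp).put(value);
        b.flip();
        return new PackedCell(b, name.length);
    }

    /** The timestamp is a fixed-size long, so it sits right after the name. */
    long timestamp() {
        return buffer.getLong(nameEnd);
    }

    ByteBuffer name() {
        return slice(0, nameEnd);
    }

    /** The value starts after the timestamp and runs to the end of the buffer. */
    ByteBuffer value() {
        return slice(nameEnd + Long.BYTES, buffer.limit());
    }

    // Each call allocates a new ByteBuffer view over the shared region.
    private ByteBuffer slice(int start, int end) {
        ByteBuffer d = buffer.duplicate();
        d.limit(end);
        d.position(start);
        return d.slice();
    }
}
```

Note that pack() copies every component into a fresh buffer, and name() and 
value() each allocate a view object. Those are the re-packing and bytebuffer 
allocation costs listed above.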

bq. if it's not essential then we can do it at the very last stage once we done 
with all more important changes which are plenty

I disagree. It makes the patch more complicated to *not* move it around. 
That something is not essential does not mean it is not the better option.

> Slightly More Off-Heap Memtables
> --------------------------------
>
>                 Key: CASSANDRA-6694
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 2.1 beta2
>
>
> The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
> the on-heap overhead is still very large. It should not be tremendously 
> difficult to extend these changes so that we allocate entire Cells off-heap, 
> instead of multiple BBs per Cell (with all their associated overhead).
> The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
> bytes per cell on average for the btree overhead, for a total overhead of 
> around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
> address (we will do alignment tricks like the VM to allow us to address a 
> reasonably large memory space, although this trick is unlikely to last us 
> forever, at which point we will have to bite the bullet and accept a 24-byte 
> per cell overhead), and 4-byte object reference for maintaining our internal 
> list of allocations, which is unfortunately necessary since we cannot safely 
> (and cheaply) walk the object graph we allocate otherwise, which is necessary 
> for (allocation-) compaction and pointer rewriting.
> The ugliest thing here is going to be implementing the various CellName 
> instances so that they may be backed by native memory OR heap memory.
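
The overhead arithmetic and the alignment trick in the description can be 
sketched roughly as follows. This is an illustration under stated 
assumptions, not the patch itself: CompressedAddress and its methods are 
hypothetical names, and the 8-byte alignment (shift of 3) and the rounding of 
object sizes up to 8 bytes are assumptions modelled on how the VM handles 
compressed references.

```java
/** Hypothetical helper; illustration only, not a Cassandra class. */
final class CompressedAddress {
    // Assumption: every allocation is 8-byte aligned, so the low 3 bits are zero.
    static final int ALIGNMENT_SHIFT = 3;
    // A 32-bit value shifted by 3 can address 2^35 bytes (32 GiB) past the base.
    static final long MAX_ADDRESSABLE = 1L << (32 + ALIGNMENT_SHIFT);

    /** Packs an aligned native address into 4 bytes, relative to a base address. */
    static int encode(long base, long address) {
        long offset = address - base;
        if (offset < 0 || offset >= MAX_ADDRESSABLE)
            throw new IllegalArgumentException("address out of range");
        if ((offset & ((1L << ALIGNMENT_SHIFT) - 1)) != 0)
            throw new IllegalArgumentException("address not 8-byte aligned");
        return (int) (offset >>> ALIGNMENT_SHIFT);
    }

    /** Recovers the full address, treating the int as unsigned. */
    static long decode(long base, int compressed) {
        return base + ((compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT);
    }

    /**
     * On-heap bytes per Cell: 8-byte object overhead, the address field, and
     * a 4-byte reference in the allocation list, rounded up to a multiple of
     * 8 (assumed object alignment). Excludes the 4-6 bytes of btree overhead.
     */
    static int perCellOverhead(int addressBytes) {
        int raw = 8 + addressBytes + 4;
        return (raw + 7) & ~7;
    }
}
```

Under these assumptions perCellOverhead(4) gives the 16-byte target and 
perCellOverhead(8) gives the 24-byte figure for a full-width address.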



