[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963923#comment-13963923 ]

Benedict commented on CASSANDRA-6694:
-------------------------------------

bq. less object overhead

There is no reduced overhead from the current patch.

bq. Also, as we consider Composite as a complete entity, storing components as 
contiguous blocks would reduce container overhead + speeds up comparisons by 
exploiting spatial locality

You seem to be backtracking to the prior suggestion of only one implementation. 
I am potentially OK with this, but see my prior comment for concerns and 
complications. The -1 was to having what we have now, except with an extra level 
of indirection (i.e. one packed Cell implementation, and one componentised like 
we had before this patch). Also, I would prefer to avoid the extra indirection 
and virtual method costs of having another inner object representation, within 
which we need another offset.

The JVM instruction set is beside the point. The point is what HotSpot will do: 
with a single implementor, or a static method with a small enough bytecode 
representation, the call will be inlined. Note I said a "multiple implementation" 
virtual method. With the option you suggest, we incur an extra virtual invocation 
with every access to the underlying bytes, some extra math to access the right 
location, and one extra object field reference to locate the position we're 
offsetting from. These costs mount up rapidly.
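
A minimal sketch of the inlining point (the CellBytes interface and 
implementations below are hypothetical, not the patch's API): with a single 
loaded implementor, HotSpot can devirtualise and inline byteAt() at the 
comparison loop; once further implementations reach the same call site, each 
byte access pays dispatch, plus the extra field load and offset math of the 
second layout.

{code:java}
interface CellBytes
{
    byte byteAt(int i);
}

final class OffHeapBytes implements CellBytes
{
    private final long peer; // native base address

    OffHeapBytes(long peer) { this.peer = peer; }

    public byte byteAt(int i)
    {
        return 0; // stand-in for an unsafe read at peer + i
    }
}

final class OnHeapBytes implements CellBytes
{
    private final byte[] array;
    private final int offset; // the extra field we must load on every access

    OnHeapBytes(byte[] array, int offset) { this.array = array; this.offset = offset; }

    public byte byteAt(int i)
    {
        return array[offset + i]; // extra math per access
    }
}

class CellCompare
{
    // With only one implementor loaded, byteAt() can be inlined into this loop;
    // once multiple implementations reach this call site, each access pays
    // dispatch, or at best a profiled type guard.
    static int compare(CellBytes a, CellBytes b, int length)
    {
        for (int i = 0; i < length; i++)
        {
            int cmp = Byte.compare(a.byteAt(i), b.byteAt(i));
            if (cmp != 0)
                return cmp;
        }
        return 0;
    }
}
{code}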

Hmm. No, I now note your "client" implementation: what exactly is this one? 
Please clarify, as the thrift cell is going to need to be compared with the 
other implementations, and suddenly much of any benefit will disappear. The 
best way to make comparisons cheap and easy is for both sides of the comparison 
to share at least the same layout. If we have to either virtual-invoke or 
instanceof-check for every comparison, with a different code path for comparing 
each pair of representations, there will be a performance impact. As such, the 
main benefit of this approach is eliminated in my eyes. Also, how will this 
"client" implementation achieve its various functions and define its type? It 
seems you'll still need a duplicate hierarchy.
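
For what I mean about layouts, a hypothetical sketch (NativeCellName and 
ClientCellName are illustrative names, not the patch's classes): with one 
shared layout the comparison is a single tight loop; a second layout forces a 
type check and a separate code path per pairing, paid on every comparison in 
the hot path.

{code:java}
import java.nio.ByteBuffer;

final class NativeCellName
{
    final byte[] bytes; // stand-in for a flat native region

    NativeCellName(byte[] bytes) { this.bytes = bytes; }
}

final class ClientCellName
{
    final ByteBuffer[] components; // a different, componentised layout

    ClientCellName(ByteBuffer[] components) { this.components = components; }
}

class NameComparator
{
    static int compare(Object a, Object b)
    {
        // mixed layouts: branch on concrete type before any bytes are compared
        if (a instanceof NativeCellName && b instanceof NativeCellName)
            return compareFlat((NativeCellName) a, (NativeCellName) b);
        return compareMixed(a, b);
    }

    // same layout on both sides: one loop, one code path
    static int compareFlat(NativeCellName a, NativeCellName b)
    {
        int len = Math.min(a.bytes.length, b.bytes.length);
        for (int i = 0; i < len; i++)
        {
            int cmp = Byte.compare(a.bytes[i], b.bytes[i]);
            if (cmp != 0)
                return cmp;
        }
        return Integer.compare(a.bytes.length, b.bytes.length);
    }

    static int compareMixed(Object a, Object b)
    {
        // placeholder for the extra per-pairing code paths a second layout requires
        throw new UnsupportedOperationException("separate path per representation pairing");
    }
}
{code}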

> Slightly More Off-Heap Memtables
> --------------------------------
>
>                 Key: CASSANDRA-6694
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 2.1 beta2
>
>
> The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
> the on-heap overhead is still very large. It should not be tremendously 
> difficult to extend these changes so that we allocate entire Cells off-heap, 
> instead of multiple BBs per Cell (with all their associated overhead).
> The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 
> bytes per cell on average for the btree overhead, for a total overhead of 
> around 20-22 bytes). This translates to 8-byte object overhead, 4-byte 
> address (we will do alignment tricks like the VM to allow us to address a 
> reasonably large memory space, although this trick is unlikely to last us 
> forever, at which point we will have to bite the bullet and accept a 24-byte 
> per cell overhead), and 4-byte object reference for maintaining our internal 
> list of allocations, which is unfortunately necessary since we cannot safely 
> (and cheaply) walk the object graph we allocate otherwise, which is necessary 
> for (allocation-) compaction and pointer rewriting.
> The ugliest thing here is going to be implementing the various CellName 
> instances so that they may be backed by native memory OR heap memory.
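
For illustration, a rough sketch of the arithmetic in the description above; the 
shift amount and addressing limit are assumptions in the spirit of the VM's 
compressed-oops trick, not the patch's actual encoding:

{code:java}
class OverheadSketch
{
    // per-cell target: 8-byte object header + 4-byte encoded address
    // + 4-byte reference into the allocation list = 16 bytes,
    // plus ~4-6 bytes amortised btree overhead, i.e. ~20-22 bytes per cell
    static final int TARGET_OVERHEAD_BYTES = 8 + 4 + 4;

    // if every native allocation is 8-byte aligned, a 32-bit value shifted
    // by 3 bits can address 2^35 bytes = 32GB of off-heap memory
    static final int ALIGNMENT_SHIFT = 3;

    static int encode(long address)
    {
        assert (address & ((1L << ALIGNMENT_SHIFT) - 1)) == 0 : "address must be 8-byte aligned";
        return (int) (address >>> ALIGNMENT_SHIFT);
    }

    static long decode(int encoded)
    {
        return (encoded & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }
}
{code}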



--
This message was sent by Atlassian JIRA
(v6.2#6252)
