[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972524#comment-13972524 ]
Benedict commented on CASSANDRA-6694: ------------------------------------- So, on the whole I really don't perceive this approach as better: there's a great deal of code duplication now (set to get worse still when you finish the refactor for DecoratedKey), between each of the correspondingly named cell implementations. Personally I think the Impl approach is neater as a result of avoiding that (this may be more pronounced if we decide to optimise equals() is you suggested). That said, if this moves us forwards I can live with it, if you can address point 1 below. There are a few problems though: # I am *very* opposed to a public setPeer() method. This is a deal breaker for me - but it can be avoided with a bit more refactoring. # Your optimised updateDigest function is actually much slower than the old implementation for all but the smallest values: an optimised version needs to batch the contents into an array (stored in a ThreadLocal) and call updateDigest with the array, unless the total size is very small (there's a crossover point on my laptop of about 12 bytes, under which it's faster to call update(byte)). # AbstractNativeCell.getBytes actually calls setBytes # excessHeapSize... should be unsharedHeapSize... # There should be no hashCode method in Buffer\*Cell - I removed these for a reason. Because we can have a Cell that is a CellName, and vice-versa, using a Cell as a key for a map is likely dangerous. Since we don't do it anywhere, it's safe to simply remove the methods. There may be other minor issues, I'll hold off giving it a formal review until we decide the direction we're going. To respond to a few of your comments: bq. CounterUpdateCell interface is missing as well as NativeCounterUpdateCell implementation to match it. There shouldn't be one for the time being - we can never construct one. bq. CounterUpdateCell should be BufferCounterUpdateCell as it extends BufferCell Same reason - it doesn't exist as either or, so I made a conscious decision to leave it as a CounterUpdateCell: the fact that it extends BufferCell is kind of unimportant. It's purpose is somewhat different, and I think it is better left named CounterUpdateCell, as that is its purpose (to carry a counter update as far as the memtable, and no further). bq. Impl classes extends another Impl classes which doesn't make much sense as all of the methods are static. This brings in the namespace of the extended class' static methods, which is useful. bq. When taken out of context like that it doesn't really make sense but what I meant, there are situation where we don't really need to get BB from the CellName but can transfer bytes directly (especially for the native cell implementations). Sure, but again: scope of ticket, and care needs to be taken when doing this (e.g. your updateDigest modifications) > Slightly More Off-Heap Memtables > -------------------------------- > > Key: CASSANDRA-6694 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6694 > Project: Cassandra > Issue Type: Improvement > Components: Core > Reporter: Benedict > Assignee: Benedict > Labels: performance > Fix For: 2.1 beta2 > > > The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as > the on-heap overhead is still very large. It should not be tremendously > difficult to extend these changes so that we allocate entire Cells off-heap, > instead of multiple BBs per Cell (with all their associated overhead). > The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 > bytes per cell on average for the btree overhead, for a total overhead of > around 20-22 bytes). This translates to 8-byte object overhead, 4-byte > address (we will do alignment tricks like the VM to allow us to address a > reasonably large memory space, although this trick is unlikely to last us > forever, at which point we will have to bite the bullet and accept a 24-byte > per cell overhead), and 4-byte object reference for maintaining our internal > list of allocations, which is unfortunately necessary since we cannot safely > (and cheaply) walk the object graph we allocate otherwise, which is necessary > for (allocation-) compaction and pointer rewriting. > The ugliest thing here is going to be implementing the various CellName > instances so that they may be backed by native memory OR heap memory. -- This message was sent by Atlassian JIRA (v6.2#6252)