[ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137603#comment-14137603
 ] 

Jason Brown commented on CASSANDRA-7282:
----------------------------------------

Food for thought:

I've reviewed the branch Benedict has linked, and while I'll run the stress 
later today, I'd like to add some thoughts on the code itself. 

To perhaps state the obvious from reading the patch, the field Memtable.rows is 
now a InsertOnlyOrderedMap instead of the previous ConcurrentSkipListMap. If 
the Partitioner sorts data by hash code (sortsByHashCode()), Memtable will use 
the simple IOOM.Adapter implementation; else Memtable uses the 
NonBlockingHashOrderedMap. I think we can agree NBHOM is considerably more code 
and more involved than IIOM.Adapter (wait 'til the end, please, Benedict :) ).

Currently, only the Murmur3Partitioner is written to sort by hash code. I would 
venture to guess the vast majority of new clusters created >= cassandra 1.2 use 
the Murmur3Partitioner, and, further, the vast majority of clusters created 
before 1.2 used RandomPartitioner. Thus, not many clusters use the ordered 
partitioners; we've also steered folks away from those unless they really know 
what they are doing (and then we still try to dissuade them).If we take the 
assumption that most clusters' partioner is either Murmur3 or Random, and weigh 
it against the the current patch, the majority of the code (and, what some have 
argued, the complexity) is to support the RandomPartition and the ordered 
partitioners. If we can make RandomPartitioner be able to "sort by hash code", 
and thus be able to take advantage of IIOM.Adapter's functionality, then the 
NBHOM-related functionality would only be necessary for for ordered 
partitioners (let's leave LocalPartitioner, used for indices, out for a 
moment). It seems to me that we could pass on the NBHOM-related functionality 
as it would only be beneficial for a small subset of uses case; not zero use 
cases, mind you, but perhaps enough to take a pass on it for this round. I 
think some commenters concerns about the perceived complexity of the patch 
would then be assuaged, and we can deliver an improvement performance for the 
majority of uses.

[~benedict], Please let me know if I've misunderstood something fundamental 
about the patch that would invalidate my arguments; the 
conclusion/recommendation is, of course, open for discussion.

> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: profile.yaml, reads.svg, run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in 
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
> majority of users use a hash partitioner, it occurs to me we could maintain a 
> hybrid ordered list / hash map. The list would impose the normal order on the 
> collection, but a hash index would live alongside as part of the same data 
> structure, simply mapping into the list and permitting O(1) lookups and 
> inserts.
> I've chosen to implement this initial version as a linked-list node per item, 
> but we can optimise this in future by storing fatter nodes that permit a 
> cache-line's worth of hashes to be checked at once,  further reducing the 
> constant factor costs for lookups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to