[
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137603#comment-14137603
]
Jason Brown commented on CASSANDRA-7282:
----------------------------------------
Food for thought:
I've reviewed the branch Benedict has linked, and while I'll run the stress
later today, I'd like to add some thoughts on the code itself.
To perhaps state the obvious from reading the patch, the field Memtable.rows is
now a InsertOnlyOrderedMap instead of the previous ConcurrentSkipListMap. If
the Partitioner sorts data by hash code (sortsByHashCode()), Memtable will use
the simple IOOM.Adapter implementation; else Memtable uses the
NonBlockingHashOrderedMap. I think we can agree NBHOM is considerably more code
and more involved than IIOM.Adapter (wait 'til the end, please, Benedict :) ).
Currently, only the Murmur3Partitioner is written to sort by hash code. I would
venture to guess the vast majority of new clusters created >= cassandra 1.2 use
the Murmur3Partitioner, and, further, the vast majority of clusters created
before 1.2 used RandomPartitioner. Thus, not many clusters use the ordered
partitioners; we've also steered folks away from those unless they really know
what they are doing (and then we still try to dissuade them).If we take the
assumption that most clusters' partioner is either Murmur3 or Random, and weigh
it against the the current patch, the majority of the code (and, what some have
argued, the complexity) is to support the RandomPartition and the ordered
partitioners. If we can make RandomPartitioner be able to "sort by hash code",
and thus be able to take advantage of IIOM.Adapter's functionality, then the
NBHOM-related functionality would only be necessary for for ordered
partitioners (let's leave LocalPartitioner, used for indices, out for a
moment). It seems to me that we could pass on the NBHOM-related functionality
as it would only be beneficial for a small subset of uses case; not zero use
cases, mind you, but perhaps enough to take a pass on it for this round. I
think some commenters concerns about the perceived complexity of the patch
would then be assuaged, and we can deliver an improvement performance for the
majority of uses.
[~benedict], Please let me know if I've misunderstood something fundamental
about the patch that would invalidate my arguments; the
conclusion/recommendation is, of course, open for discussion.
> Faster Memtable map
> -------------------
>
> Key: CASSANDRA-7282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Assignee: Benedict
> Labels: performance
> Fix For: 3.0
>
> Attachments: profile.yaml, reads.svg, run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast
> majority of users use a hash partitioner, it occurs to me we could maintain a
> hybrid ordered list / hash map. The list would impose the normal order on the
> collection, but a hash index would live alongside as part of the same data
> structure, simply mapping into the list and permitting O(1) lookups and
> inserts.
> I've chosen to implement this initial version as a linked-list node per item,
> but we can optimise this in future by storing fatter nodes that permit a
> cache-line's worth of hashes to be checked at once, further reducing the
> constant factor costs for lookups.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)