[
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323080#comment-14323080
]
Benedict commented on CASSANDRA-7282:
-------------------------------------
bq. This benchmark uses Longs as keys
Thanks for spotting that
bq. However, if even the author of the code can make this mistake so easily
It's worth noting that I was recovering from brain surgery performed a couple
of weeks prior to posting this benchmark, so it's probably not _quite_ as
indicative of the problem as it might appear. I agree it would be preferable to
use a different method, though. The problem is doing so neatly and without
penalty. The best option is probably to make all Token implement a
HashComparable interface, and throw UnsupportedOperationException if they don't
really implement it, but that is also pretty ugly. I'm pretty agnostic to the
solution to this particular problem, so I'll defer to strong opinions.
A potentially more damning criticism of that benchmark is that it assumes a
uniform address space for the hashes. As the number of nodes in a cluster
grows, the entropy in the top bits decreases, and so the performance of this
map could degrade. Besides the aforementioned improvement to bound worst case
behaviour at O(lg(N)) we could also normalise the top order bits across the
owned ranges. Possibly there are some other strategies, but I need to think on
that.
> Faster Memtable map
> -------------------
>
> Key: CASSANDRA-7282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Assignee: Benedict
> Labels: performance
> Fix For: 3.0
>
> Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg,
> run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast
> majority of users use a hash partitioner, it occurs to me we could maintain a
> hybrid ordered list / hash map. The list would impose the normal order on the
> collection, but a hash index would live alongside as part of the same data
> structure, simply mapping into the list and permitting O(1) lookups and
> inserts.
> I've chosen to implement this initial version as a linked-list node per item,
> but we can optimise this in future by storing fatter nodes that permit a
> cache-line's worth of hashes to be checked at once, further reducing the
> constant factor costs for lookups.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)