[
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136908#comment-14136908
]
Benedict commented on CASSANDRA-7282:
-------------------------------------
Some more numbers, this time with a warmup dataset that pre-populates the map,
which reduces the variability caused by differing throughput rates. These
numbers show NBHOM consistently around 3x faster than CSLM, although it
introduces per-run variability due to GC.
{noformat}
Benchmark                          (readWriteRatio)  (type)  (warmup)  Mode   Samples  Score     Score error  Units
b.b.c.HashOrderedCollections.test  0.9               CSLM    5000000   thrpt  5        1392.273  2918.717     ops/ms
b.b.c.HashOrderedCollections.test  0.9               NBHOM   5000000   thrpt  5        5088.408  8964.885     ops/ms
b.b.c.HashOrderedCollections.test  0.5               CSLM    5000000   thrpt  5        1128.637  2589.679     ops/ms
b.b.c.HashOrderedCollections.test  0.5               NBHOM   5000000   thrpt  5        3406.299  5606.085     ops/ms
b.b.c.HashOrderedCollections.test  0.1               CSLM    5000000   thrpt  5        924.642   1802.045     ops/ms
b.b.c.HashOrderedCollections.test  0.1               NBHOM   5000000   thrpt  5        3311.107  999.896      ops/ms
b.b.c.HashOrderedCollections.test  0                 CSLM    5000000   thrpt  5        939.757   1776.641     ops/ms
b.b.c.HashOrderedCollections.test  0                 NBHOM   5000000   thrpt  5        2781.503  4723.844     ops/ms
{noformat}
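For anyone who wants to check the "around 3x" figure against the table, here is a small sketch that divides each NBHOM score by the CSLM score at the same readWriteRatio. The class and method names are made up for illustration; the scores are copied from the table above, and the score-error column is ignored, so this is only a point estimate. The ratios come out between roughly 3.0x and 3.7x.

```java
// Illustrative only: recomputes NBHOM/CSLM throughput ratios from the scores above.
// Class and method names are hypothetical; the numbers are copied from the table.
class SpeedupTable {
    static final double[] READ_WRITE_RATIO = {0.9, 0.5, 0.1, 0.0};
    static final double[] CSLM_OPS_PER_MS  = {1392.273, 1128.637, 924.642, 939.757};
    static final double[] NBHOM_OPS_PER_MS = {5088.408, 3406.299, 3311.107, 2781.503};

    // Ratio of NBHOM throughput to CSLM throughput for row i.
    static double speedup(int i) {
        return NBHOM_OPS_PER_MS[i] / CSLM_OPS_PER_MS[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < READ_WRITE_RATIO.length; i++) {
            System.out.printf("readWriteRatio=%.1f speedup=%.2fx%n",
                              READ_WRITE_RATIO[i], speedup(i));
        }
    }
}
```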
> Faster Memtable map
> -------------------
>
> Key: CASSANDRA-7282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Assignee: Benedict
> Labels: performance
> Fix For: 3.0
>
> Attachments: profile.yaml, reads.svg, run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast
> majority of users use a hash partitioner, it occurs to me we could maintain a
> hybrid ordered list / hash map. The list would impose the normal order on the
> collection, but a hash index would live alongside as part of the same data
> structure, simply mapping into the list and permitting O(1) lookups and
> inserts.
> I've chosen to implement this initial version as a linked-list node per item,
> but we can optimise this in future by storing fatter nodes that permit a
> cache-line's worth of hashes to be checked at once, further reducing the
> constant factor costs for lookups.
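To make the hybrid ordered list / hash index idea concrete, here is a minimal single-threaded sketch. It is not the NBHOM implementation benchmarked in this ticket: it assumes a fixed number of buckets (no resizing), no concurrency control, String keys and values, and a stand-in token function (the Murmur3 64-bit finalizer applied to hashCode()). All class and method names are hypothetical. One linked list holds every item in unsigned token order, and an array of per-bucket sentinel nodes maps the top bits of a token straight into that list, so a lookup or insert only walks the few nodes in its own bucket.

```java
import java.util.ArrayList;
import java.util.List;

// Single-threaded illustration only; every name here is hypothetical.
class HashOrderedSketch {
    // One list node per item. Bucket sentinels have key == null.
    static final class Node {
        final long token;
        final String key;
        String value;
        Node next;

        Node(long token, String key, String value) {
            this.token = token;
            this.key = key;
            this.value = value;
        }
    }

    private final int shift;     // token >>> shift selects the bucket
    private final Node[] index;  // index[b] is the sentinel that starts bucket b
    private int size;

    HashOrderedSketch(int bits) {
        if (bits < 1 || bits > 20) throw new IllegalArgumentException("bits must be in 1..20");
        shift = 64 - bits;
        index = new Node[1 << bits];
        Node prev = null;
        // Pre-link one sentinel per bucket, in ascending unsigned token order.
        for (int b = 0; b < index.length; b++) {
            Node sentinel = new Node(((long) b) << shift, null, null);
            index[b] = sentinel;
            if (prev != null) prev.next = sentinel;
            prev = sentinel;
        }
    }

    // Stand-in for a hash partitioner: Murmur3's 64-bit finalizer over hashCode().
    static long token(String key) {
        long h = key.hashCode();
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Jump to the bucket via the index, then walk to the last node whose
    // successor does not sort before t (tokens compared as unsigned).
    private Node seek(long t) {
        Node prev = index[(int) (t >>> shift)];
        while (prev.next != null && Long.compareUnsigned(prev.next.token, t) < 0) {
            prev = prev.next;
        }
        return prev;
    }

    String get(String key) {
        long t = token(key);
        for (Node n = seek(t).next; n != null && n.token == t; n = n.next) {
            if (key.equals(n.key)) return n.value;
        }
        return null;
    }

    // Inserts or overwrites; returns the previous value, or null if the key is new.
    String put(String key, String value) {
        long t = token(key);
        Node prev = seek(t);
        for (Node n = prev.next; n != null && n.token == t; n = n.next) {
            if (key.equals(n.key)) {
                String old = n.value;
                n.value = value;
                return old;
            }
        }
        Node added = new Node(t, key, value);
        added.next = prev.next;
        prev.next = added;
        size++;
        return null;
    }

    int size() {
        return size;
    }

    // Ordered scan: follow the list from the first sentinel, skipping sentinels.
    List<String> keysInTokenOrder() {
        List<String> keys = new ArrayList<>();
        for (Node n = index[0]; n != null; n = n.next) {
            if (n.key != null) keys.add(n.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        HashOrderedSketch map = new HashOrderedSketch(4);
        for (String k : new String[] {"alpha", "beta", "gamma", "delta"}) {
            map.put(k, k.toUpperCase());
        }
        System.out.println(map.get("gamma"));
        System.out.println(map.keysInTokenOrder());
    }
}
```

With uniformly distributed tokens and a bucket count on the order of the item count, each bucket holds a small constant number of nodes on average, which is where the O(1) expected cost comes from. A real version would also need to grow the index as the map fills and make inserts safe under concurrent access, neither of which this sketch attempts.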