[jira] [Commented] (CASSANDRA-7282) Faster Memtable map

Benedict (JIRA) Wed, 17 Sep 2014 00:15:08 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136884#comment-14136884
 ]


Benedict commented on CASSANDRA-7282:
-------------------------------------

To add some more data to this discussion, I decided quickly to isolate just the 
CSLM and NBHOM for comparison. Code available 
[here|https://github.com/belliottsmith/bench]. 

Using JMH for writes to an append-only collection is tricky, since any 
increased throughput rate is naturally throttling and incurs greater GC and 
algorithmic overhead, so these numbers are likely to be understating the yield, 
but the NBHOM is anywhere from 1.75-2.75x the throughput of CSLM, depending on 
if the workload is read-heavy or write-heavy. It is likely the real yield for 
writes is inbetween these two numbers.

{noformat}
b.b.c.HashOrderedCollections.test      40000000               0.9    CSLM  
thrpt       10  3593.788      196.459  ops/ms
b.b.c.HashOrderedCollections.test      40000000               0.9   NBHOM  
thrpt       10  9299.991      179.408  ops/ms
b.b.c.HashOrderedCollections.test      40000000               0.5    CSLM  
thrpt       10  2402.021       79.321  ops/ms
b.b.c.HashOrderedCollections.test      40000000               0.5   NBHOM  
thrpt       10  5648.160      294.727  ops/ms
b.b.c.HashOrderedCollections.test      40000000               0.1    CSLM  
thrpt       10  2089.597       71.228  ops/ms
b.b.c.HashOrderedCollections.test      40000000               0.1   NBHOM  
thrpt       10  3665.048      138.384  ops/ms
b.b.c.HashOrderedCollections.test      40000000                 0    CSLM  
thrpt       10  2008.595       46.308  ops/ms
b.b.c.HashOrderedCollections.test      40000000                 0   NBHOM  
thrpt       10  3515.328      214.304  ops/ms
{noformat}

So, as we reduce other overheads in the application and increase memtable 
capacity through further offheap improvements, any yield we are seeing 
application side can only go up.

> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: profile.yaml, reads.svg, run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in 
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
> majority of users use a hash partitioner, it occurs to me we could maintain a 
> hybrid ordered list / hash map. The list would impose the normal order on the 
> collection, but a hash index would live alongside as part of the same data 
> structure, simply mapping into the list and permitting O(1) lookups and 
> inserts.
> I've chosen to implement this initial version as a linked-list node per item, 
> but we can optimise this in future by storing fatter nodes that permit a 
> cache-line's worth of hashes to be checked at once,  further reducing the 
> constant factor costs for lookups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-7282) Faster Memtable map

Reply via email to