[ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104014#comment-14104014 ]

Benedict commented on CASSANDRA-7282:
-------------------------------------

A few possibilities:

1) very few data points, so it's difficult to draw much of a conclusion
2) with so few data points the 99.9th %ile boundary is arbitrary, as evidenced 
by the better max latency
3) the largest latencies are ordinarily encountered during either CL or 
memtable flush; since we're writing faster, it's possible we are simply putting 
the disks under increased pressure and hitting higher latencies more often, 
bringing the 99.9th %ile closer to the max

Ryan - I'd suggest running with offheap_objects, raising the commit log size 
limit to 16GB, setting memtable_cleanup_threshold to 0.99, setting your on-heap 
memtable capacity to 4GB and your off-heap capacity to 2GB, and running the 
test with just one column of around 32 bytes. Then use 48M items and perform 
your normal run of insert then read; we *shouldn't* see any flush result from 
that, if I did my quick bit of math correctly.
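For reference, the settings above map onto cassandra.yaml roughly as follows. This is a sketch using 2.1-era option names (sizes converted to MB); exact option names can vary between versions, so check your own cassandra.yaml. Note that 48M items of ~32 bytes each is only about 1.5GB of raw column data, which is why no flush is expected under a 2GB off-heap memtable limit.

```yaml
# Hypothetical cassandra.yaml fragment for the suggested test run.
memtable_allocation_type: offheap_objects   # store cell data off-heap
commitlog_total_space_in_mb: 16384          # raise CL size limit to 16GB
memtable_cleanup_threshold: 0.99            # defer flush until nearly full
memtable_heap_space_in_mb: 4096             # on-heap memtable capacity: 4GB
memtable_offheap_space_in_mb: 2048          # off-heap memtable capacity: 2GB
```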


> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 3.0
>
>
> Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in 
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
> majority of users use a hash partitioner, it occurs to me we could maintain a 
> hybrid ordered list / hash map. The list would impose the normal order on the 
> collection, but a hash index would live alongside as part of the same data 
> structure, simply mapping into the list and permitting O(1) lookups and 
> inserts.
> I've chosen to implement this initial version as a linked-list node per item, 
> but we can optimise this in future by storing fatter nodes that permit a 
> cache-line's worth of hashes to be checked at once, further reducing the 
> constant factor costs for lookups.
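The split the ticket describes — an ordered view for range scans plus a hash index for O(1) point lookups over the same entries — can be illustrated with the minimal sketch below. This is not the patch's single node-per-item structure (which embeds the hash index inside the list itself); it simply pairs two stock concurrent maps to show where each access path goes. `HybridMemtableMap` and its method names are hypothetical, invented for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

/**
 * Illustrative sketch only: point reads go through a hash index (O(1)),
 * while ordered iteration goes through a skip list (O(lg n) maintenance).
 * The real proposal keeps one hybrid structure rather than two maps.
 */
final class HybridMemtableMap<K extends Comparable<K>, V> {
    private final ConcurrentSkipListMap<K, V> ordered = new ConcurrentSkipListMap<>();
    private final ConcurrentHashMap<K, V> index = new ConcurrentHashMap<>();

    void put(K key, V value) {
        ordered.put(key, value);   // maintains sort order for scans
        index.put(key, value);     // O(1) path for point lookups
    }

    V get(K key) {
        return index.get(key);     // skips the O(lg n) skip-list walk
    }

    Iterable<Map.Entry<K, V>> inOrder() {
        return ordered.entrySet(); // ascending-key view for range scans
    }
}
```

With a hash partitioner the "sorted" order is already token (hash) order, which is what lets the real design fuse the two structures into one and cut the constant factors this two-map sketch still pays twice.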



--
This message was sent by Atlassian JIRA
(v6.2#6252)
