[ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105022#comment-14105022
 ] 

Benedict commented on CASSANDRA-7282:
-------------------------------------

Ryan provided me with logs, and we can see a flush trigger exactly as the 0.5 
mct write test finishes, which explains the dip. However, the temporary dip 
below normal is a bit odd and I'd like to explain it better. My best guess is 
that the key cache is being populated; the normal run also shows a slower 
ramp-up here, which would be consistent with that (since it flushed earlier). 
We could try a run with the key cache disabled to confirm.

I'd also like to be able to explain the different GC characteristics: we GC 
less often with the new code, but fairly consistently see one _very_ long 
ParNew pause. Currently I don't have a great explanation for this, except 
possibly the very large array allocations we need to do. Switching to an 
array-of-arrays might be better, keeping each of the arrays preferably under 
1M or so.
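
To make the array-of-arrays idea concrete, here is a minimal sketch. It is not the code from the patch: the class name ChunkedLongArray, the choice of long elements, and the 512 KiB chunk size are all assumptions picked for illustration. The point is only that one logical array can be backed by many small allocations, so no single allocation is very large.

```java
// Hypothetical sketch, not the patch's code: one logical long[] backed by
// fixed-size chunks so that no single allocation is very large.
final class ChunkedLongArray {
    // Assumed chunk size: 2^16 longs * 8 bytes = 512 KiB, under the ~1M target.
    private static final int CHUNK_SHIFT = 16;
    private static final int CHUNK_SIZE = 1 << CHUNK_SHIFT;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final long[][] chunks;
    private final long length;

    ChunkedLongArray(long length) {
        if (length < 0) throw new IllegalArgumentException("negative length");
        this.length = length;
        // Round up to a whole number of chunks; toIntExact rejects absurd sizes.
        int chunkCount = Math.toIntExact((length + CHUNK_SIZE - 1) >>> CHUNK_SHIFT);
        chunks = new long[chunkCount][];
        long remaining = length;
        for (int i = 0; i < chunkCount; i++) {
            // Every chunk is full-sized except possibly the last one.
            chunks[i] = new long[(int) Math.min(CHUNK_SIZE, remaining)];
            remaining -= chunks[i].length;
        }
    }

    long get(long index) {
        checkIndex(index);
        return chunks[(int) (index >>> CHUNK_SHIFT)][(int) (index & CHUNK_MASK)];
    }

    void set(long index, long value) {
        checkIndex(index);
        chunks[(int) (index >>> CHUNK_SHIFT)][(int) (index & CHUNK_MASK)] = value;
    }

    long length() { return length; }

    int chunkCount() { return chunks.length; }

    private void checkIndex(long index) {
        if (index < 0 || index >= length)
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
    }
}
```

Each access costs one extra indirection plus a shift and a mask. Whether that is worth paying to shorten the ParNew pause is exactly what would need measuring.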

For the mct 0.99 run, what's most surprising is that reads are _not_ 
particularly faster, which is not what I found running locally, and which also 
conflicts with the non-trivial increase in write performance. Back to the 
drawing board to try and figure out exactly what's happening.

> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 3.0
>
>
> Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in 
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
> majority of users use a hash partitioner, it occurs to me we could maintain a 
> hybrid ordered list / hash map. The list would impose the normal order on the 
> collection, but a hash index would live alongside as part of the same data 
> structure, simply mapping into the list and permitting O(1) lookups and 
> inserts.
> I've chosen to implement this initial version as a linked-list node per item, 
> but we can optimise this in future by storing fatter nodes that permit a 
> cache-line's worth of hashes to be checked at once, further reducing the 
> constant factor costs for lookups.
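
The hybrid ordered list / hash map described above can be sketched roughly as follows. This is a single-threaded illustration under several assumptions, not the implementation attached to the ticket: the class name HashIndexedOrderedList is made up, keys are plain signed 64-bit tokens, the index has a fixed size, and each bucket is given a sentinel node in the list. The leading bits of the token select the bucket, so bucket order matches token order and the hash index maps straight into the ordered list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, single-threaded sketch: a linked list kept in token order,
// with a fixed-size hash index of sentinel nodes pointing into that list.
final class HashIndexedOrderedList<V> {
    private static final class Node<V> {
        final long token;
        final boolean sentinel; // true for per-bucket marker nodes
        V value;
        Node<V> next;

        Node(long token, boolean sentinel, V value) {
            this.token = token;
            this.sentinel = sentinel;
            this.value = value;
        }
    }

    private final int shift;
    private final Node<V>[] buckets; // buckets[b] is the sentinel heading bucket b

    @SuppressWarnings("unchecked")
    HashIndexedOrderedList(int indexBits) {
        if (indexBits < 1 || indexBits > 20)
            throw new IllegalArgumentException("indexBits must be in [1, 20]");
        shift = 64 - indexBits;
        buckets = (Node<V>[]) new Node[1 << indexBits];
        // Pre-link one sentinel per bucket, last bucket first.
        Node<V> next = null;
        for (int b = buckets.length - 1; b >= 0; b--) {
            Node<V> s = new Node<>(0L, true, null);
            s.next = next;
            buckets[b] = s;
            next = s;
        }
    }

    // Flipping the sign bit maps signed order onto unsigned order, so the
    // leading bits give a bucket number that increases with the token.
    private int bucketOf(long token) {
        return (int) ((token ^ Long.MIN_VALUE) >>> shift);
    }

    // Last node in the token's bucket with a smaller token (or the sentinel).
    private Node<V> predecessor(long token) {
        Node<V> prev = buckets[bucketOf(token)];
        while (prev.next != null && !prev.next.sentinel && prev.next.token < token)
            prev = prev.next;
        return prev;
    }

    // Inserts or overwrites; returns the previous value, or null if none.
    V put(long token, V value) {
        Node<V> prev = predecessor(token);
        Node<V> cur = prev.next;
        if (cur != null && !cur.sentinel && cur.token == token) {
            V old = cur.value;
            cur.value = value;
            return old;
        }
        Node<V> node = new Node<>(token, false, value);
        node.next = cur;
        prev.next = node;
        return null;
    }

    V get(long token) {
        Node<V> cur = predecessor(token).next;
        return (cur != null && !cur.sentinel && cur.token == token) ? cur.value : null;
    }

    // Walks the whole list, skipping sentinels, to show the imposed order.
    List<Long> tokensInOrder() {
        List<Long> out = new ArrayList<>();
        for (Node<V> n = buckets[0]; n != null; n = n.next)
            if (!n.sentinel) out.add(n.token);
        return out;
    }
}
```

In this sketch, lookups and inserts are only expected O(1) when tokens are spread evenly and the index is sized in proportion to the number of entries. It never resizes and is not thread-safe, both of which a memtable structure would have to address.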



--
This message was sent by Atlassian JIRA
(v6.2#6252)
