[jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions

Pavel Yaskevich (JIRA) Mon, 17 Oct 2016 12:36:36 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583210#comment-15583210 ]


Pavel Yaskevich commented on CASSANDRA-9754:
--------------------------------------------

[~mkjellman] This looks great! Can you please post the SSTable sizes and 
their estimated key counts as well? AFAIR there is another problem with how 
indexes are currently stored: if a key is not in the key cache, there is no 
way to jump to it directly in the index file, and the index reader has to 
scan through the index segment to find the requested key. So I'm wondering 
what happens when each SSTable holds many small-to-medium sized keys, e.g. 
64-128 MB each (let's say the SSTable size is set to 1G or 2G), and stress 
readers are trying to read random keys. What would be the difference between 
current index read performance and index + birch tree?
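
As a rough illustration of the scan described above, here is a minimal sketch. 
It is not Cassandra's index reader: the class and field names are 
hypothetical, plain string order stands in for token order, and it assumes 
the index is sampled every N entries so that a lookup binary-searches the 
samples and then scans forward through one segment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a sampled index lookup; none of these names are Cassandra classes.
final class SampledIndexLookup {
    private final List<String> sortedKeys;                    // stands in for the on-disk index file
    private final List<Integer> samples = new ArrayList<>();  // positions of the sampled entries
    int entriesScanned;                                       // entries touched by the last lookup

    SampledIndexLookup(List<String> sortedKeys, int interval) {
        this.sortedKeys = sortedKeys;
        for (int i = 0; i < sortedKeys.size(); i += interval) {
            samples.add(i);
        }
    }

    // Returns the position of the key in the index, or -1 if it is absent.
    int find(String key) {
        entriesScanned = 0;
        // Binary search for the last sampled entry that sorts at or before the key.
        int lo = 0, hi = samples.size() - 1, start = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedKeys.get(samples.get(mid)).compareTo(key) <= 0) {
                start = samples.get(mid);
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (start < 0) {
            return -1; // key sorts before the first index entry
        }
        // Linear scan through the segment: this is the cost a key cache miss pays.
        for (int i = start; i < sortedKeys.size(); i++) {
            entriesScanned++;
            int cmp = sortedKeys.get(i).compareTo(key);
            if (cmp == 0) return i;
            if (cmp > 0) return -1; // passed the slot where the key would be
        }
        return -1;
    }
}
```

With 1000 keys named k000 to k999 and a sampling interval of 128, looking up 
k200 starts at position 128 and touches 73 entries before it finds the key.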

> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
>                 Key: CASSANDRA-9754
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Michael Kjellman
>            Priority: Minor
>             Fix For: 4.x
>
>         Attachments: gc_collection_times_with_birch.png, 
> gc_collection_times_without_birch.png, gc_counts_with_birch.png, 
> gc_counts_without_birch.png, 
> perf_cluster_1_with_birch_read_latency_and_counts.png, 
> perf_cluster_1_with_birch_write_latency_and_counts.png, 
> perf_cluster_2_with_birch_read_latency_and_counts.png, 
> perf_cluster_2_with_birch_write_latency_and_counts.png, 
> perf_cluster_3_without_birch_read_latency_and_counts.png, 
> perf_cluster_3_without_birch_write_latency_and_counts.png
>
>
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the 
> objects are IndexInfo and its ByteBuffers. This is especially bad on 
> endpoints with large CQL partitions. If a CQL partition is, say, 6.4GB, it 
> will have 100K IndexInfo objects and 200K ByteBuffers. This will create a 
> lot of churn for GC. Can this be improved by not creating so many objects?
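
The object counts in the description can be reproduced with a short 
calculation. This is a sketch under stated assumptions, not Cassandra code: 
the helper class is hypothetical, one IndexInfo is assumed per 64 KB of 
partition data (the default column_index_size_in_kb), each IndexInfo is 
assumed to hold two ByteBuffers (first and last name), and decimal units are 
used.

```java
// Hypothetical helper, not part of Cassandra: estimates index objects for one partition.
final class IndexInfoEstimate {
    // Assumption: one IndexInfo entry per columnIndexBytes of partition data.
    static long indexInfoCount(long partitionBytes, long columnIndexBytes) {
        return (partitionBytes + columnIndexBytes - 1) / columnIndexBytes; // ceiling division
    }

    // Assumption: each IndexInfo keeps a first-name and a last-name ByteBuffer.
    static long byteBufferCount(long indexInfoCount) {
        return 2 * indexInfoCount;
    }
}
```

For a 6.4 GB partition (6,400,000,000 bytes) and a 64,000-byte index block, 
this gives 100,000 IndexInfo objects and 200,000 ByteBuffers, matching the 
100K and 200K figures quoted above.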



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
