[jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions

Jack Krupansky (JIRA) Mon, 11 Apr 2016 07:16:11 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235141#comment-15235141
 ]


Jack Krupansky commented on CASSANDRA-9754:
-------------------------------------------

Any idea how a new wide partition will perform relative to the same amount of 
data and same number of clustering rows divided into bucketed partitions? For 
example, a single 1 GB wide partition vs. ten 100 MB partitions (same partition 
key plus a 0-9 bucket number) vs. a hundred 10 MB partitions (0-99 bucket 
number), for two access patterns: 1) random access to a row or short slice, and 2) 
a full bulk read of the 1 GB of data, one moderate slice at a time.
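
To make the bucketed layout concrete, here is a minimal sketch of how a client might derive the bucket number that is appended to the partition key. The function names, the choice of 10 buckets, and the use of a CRC32 hash are all assumptions for illustration; nothing in this ticket prescribes them.

```python
import zlib

NUM_BUCKETS = 10  # hypothetical: ten ~100 MB buckets for ~1 GB of data


def bucket_for(clustering_key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a clustering value to a stable bucket number in [0, num_buckets)."""
    # crc32 is deterministic across processes, unlike Python's built-in hash().
    return zlib.crc32(clustering_key.encode("utf-8")) % num_buckets


def bucketed_partition_key(partition_key: str, clustering_key: str,
                           num_buckets: int = NUM_BUCKETS) -> tuple:
    """Composite partition key: the original key plus the bucket number."""
    return (partition_key, bucket_for(clustering_key, num_buckets))


def all_partition_keys(partition_key: str,
                       num_buckets: int = NUM_BUCKETS) -> list:
    """Every bucketed key a full bulk read would have to visit."""
    return [(partition_key, b) for b in range(num_buckets)]
```

Hash-based assignment spreads rows evenly but scatters adjacent rows across buckets, so a slice may touch every bucket. A range- or time-based bucket number would keep a short slice inside one bucket, at the cost of possible skew.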

Or maybe the question is equivalent to asking what the cost is to access the 
last row of the 1 GB partition vs. the last row of the tenth or hundredth 
bucket of the bucketed equivalent.

No precision required. Just inquiring whether we can get rid of bucketing as a 
preferred data modeling strategy, at least for the common use cases where the 
sum of the buckets is roughly 2 GB or less.

The bucketing approach does have the side effect of distributing the buckets 
around the cluster, which could be a good thing, or maybe not.

> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
>                 Key: CASSANDRA-9754
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Michael Kjellman
>            Priority: Minor
>
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects 
> are IndexInfo and its ByteBuffers. This is especially bad in endpoints with 
> large CQL partitions. If a CQL partition is, say, 6.4 GB, it will have 100K 
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for 
> GC. Can this be improved by not creating so many objects?
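
The object counts in the description can be reproduced with simple arithmetic. This is only a sketch under two assumptions that the description does not state: one IndexInfo entry per 64 KB of partition data (the default `column_index_size_in_kb`) and two ByteBuffers per entry. Decimal units are used so the figures come out round, and the function names are hypothetical.

```python
def index_entry_count(partition_bytes: int,
                      index_interval_bytes: int = 64_000) -> int:
    """Approximate number of index entries for one partition.

    Assumes one entry per index interval (64 KB by default, decimal units).
    """
    # Ceiling division: a partial trailing interval still gets an entry.
    return -(-partition_bytes // index_interval_bytes)


def byte_buffer_count(entries: int, buffers_per_entry: int = 2) -> int:
    """Assumes each index entry holds two ByteBuffers."""
    return entries * buffers_per_entry


entries = index_entry_count(6_400_000_000)   # the 6.4 GB partition
buffers = byte_buffer_count(entries)
print(entries, buffers)
```

Under the same assumptions, a 1 GB partition would carry 15,625 entries and each 100 MB bucket about 1,563, which is the per-partition cost that bucketing spreads out.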



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
