[ https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220973#comment-15220973 ]
Michael Kjellman commented on CASSANDRA-9754:
---------------------------------------------
Alright, good news! My unit test that creates and reads from an index with
100,000,000 entries (!!) now passes!
I came up with a pretty nice solution to the word-list issue (I couldn't find a
word list with 100M+ entries): instead, I generate n TimeUUID elements, which
never repeat, can be generated in unlimited numbers, and come out already
sorted as they're generated!
I'm currently profiling the code to come up with numbers...
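
For anyone who wants to try the same trick, here is a minimal sketch of the idea, not the actual test code from the patch. It builds version 1 (time-based) UUIDs by hand using only java.util.UUID. The class name TimeUuidKeys, the fixed clock-sequence/node value, and the "bump the timestamp by one tick" rule are all assumptions made for illustration. Note that "sorted" here means ordered by the embedded timestamp, which is how a time-based comparator would order them; java.util.UUID.compareTo does not sort by timestamp.

```java
import java.util.UUID;

// Hypothetical helper: emits version 1 UUIDs whose embedded timestamps are
// strictly increasing, so the keys are unique and already in time order.
final class TimeUuidKeys {
    // 100ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    // Assumed fixed low 64 bits: variant bits "10", clock sequence 0, and a
    // made-up node id with the multicast bit set. Not a real MAC address.
    private static final long CLOCK_SEQ_AND_NODE = 0x8000010000000001L;

    private long lastTimestamp = 0L;

    // Returns the next UUID; the timestamp never repeats or goes backwards.
    synchronized UUID next() {
        long now = System.currentTimeMillis() * 10_000L + UUID_EPOCH_OFFSET;
        lastTimestamp = Math.max(now, lastTimestamp + 1);
        return fromTimestamp(lastTimestamp);
    }

    // Packs a 60-bit timestamp into the version 1 field layout.
    static UUID fromTimestamp(long ts) {
        long msb = 0L;
        msb |= (ts & 0xFFFFFFFFL) << 32;        // time_low
        msb |= ((ts >>> 32) & 0xFFFFL) << 16;   // time_mid
        msb |= 0x1000L;                         // version = 1
        msb |= (ts >>> 48) & 0x0FFFL;           // time_hi
        return new UUID(msb, CLOCK_SEQ_AND_NODE);
    }

    public static void main(String[] args) {
        TimeUuidKeys gen = new TimeUuidKeys();
        for (int i = 0; i < 5; i++) {
            UUID u = gen.next();
            System.out.println(u + " ts=" + u.timestamp());
        }
    }
}
```

Because each call advances the timestamp by at least one 100ns tick, n calls yield n distinct keys in generation order, with no need to sort or de-duplicate afterwards.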
> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Michael Kjellman
> Priority: Minor
>
> Looking at a heap dump of a 2.0 cluster, I found that the majority of the
> objects are IndexInfo and its ByteBuffers. This is especially bad in endpoints
> with large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for
> GC. Can this be improved by not creating so many objects?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)