[ https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190718#comment-15190718 ]
Michael Kjellman commented on CASSANDRA-9754:
---------------------------------------------
I have the new FileSegment-friendly implementation working for the following
conditions:
1) straight search for key -> get value
2) iterate efficiently both forwards and in reverse through all elements in the
tree
3) binary search for a given key and then iterate through all remaining keys
from the found offset
4) overflow page for handling variable length tree elements that exceed the max
size for a given individual page (up to 2GB)
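
To make conditions 1-3 concrete, here is a minimal sketch of the access patterns only. It assumes an in-memory sorted array as a stand-in for the on-disk tree; the class and method names (SortedIndexSketch, get, seek, keysFrom, allKeys) are hypothetical and do not come from the patch, and overflow pages (condition 4) are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: a sorted in-memory array stands in for the
// on-disk tree. None of these names come from the CASSANDRA-9754 patch.
class SortedIndexSketch {
    private final String[] keys;  // must be sorted ascending
    private final long[] values;  // values[i] belongs to keys[i]

    SortedIndexSketch(String[] sortedKeys, long[] values) {
        this.keys = sortedKeys;
        this.values = values;
    }

    // (1) Straight search: key -> value, or null when the key is absent.
    Long get(String key) {
        int i = Arrays.binarySearch(keys, key);
        return i >= 0 ? values[i] : null;
    }

    // (2) Iterate over all keys, forwards or reversed.
    List<String> allKeys(boolean reversed) {
        List<String> out = new ArrayList<>();
        if (reversed) {
            for (int i = keys.length - 1; i >= 0; i--) out.add(keys[i]);
        } else {
            for (int i = 0; i < keys.length; i++) out.add(keys[i]);
        }
        return out;
    }

    // (3a) Binary search for the offset of the first key >= start.
    int seek(String start) {
        int i = Arrays.binarySearch(keys, start);
        return i >= 0 ? i : -(i + 1); // insertion point when not found
    }

    // (3b) Iterate over all remaining keys from a found offset.
    List<String> keysFrom(int offset) {
        List<String> out = new ArrayList<>();
        for (int i = offset; i < keys.length; i++) out.add(keys[i]);
        return out;
    }

    public static void main(String[] args) {
        SortedIndexSketch idx = new SortedIndexSketch(
                new String[] {"a", "c", "e", "g"},
                new long[] {10, 30, 50, 70});
        System.out.println(idx.get("e"));
        System.out.println(idx.allKeys(true));
        System.out.println(idx.keysFrom(idx.seek("d")));
    }
}
```

The real implementation would presumably read pages from a file segment rather than hold every key on the heap; the sketch only shows the lookup and iteration semantics.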
I have also successfully run some new unit tests I wrote that now do 5000
consecutive iterations with randomly generated data (to "fuzz" the tree for
edge conditions) for building and validating trees that contain between
300,000-500,000 elements. I've also spent a good amount of time writing some
pretty reasonable documentation of the binary format itself.
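
For readers unfamiliar with this style of test, the following is a rough sketch of what a randomized build-and-validate loop can look like. It is not the actual unit test: the names (FuzzSketch, runOnce, fuzz) are hypothetical, a sorted long[] stands in for the serialized tree, and a TreeSet serves as the reference model. Sizes are scaled down from the 300,000-500,000 elements mentioned above so the sketch runs quickly.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.TreeSet;

// Hypothetical fuzz harness: random data is generated per iteration, a
// structure is built from it, and every key is validated against a model.
class FuzzSketch {
    // One iteration. Seeding from the iteration number makes failures reproducible.
    static boolean runOnce(long seed, int minSize, int maxSize) {
        Random rnd = new Random(seed);
        int size = minSize + rnd.nextInt(maxSize - minSize + 1);

        // Reference model: unique keys in sorted order.
        TreeSet<Long> model = new TreeSet<>();
        while (model.size() < size) model.add(rnd.nextLong());

        // "Build" step: the sorted array is a stand-in for the real tree.
        long[] built = new long[size];
        int n = 0;
        for (long k : model) built[n++] = k;

        // "Validate" step: every key must be found at its expected offset.
        int expected = 0;
        for (long k : model) {
            if (Arrays.binarySearch(built, k) != expected++) return false;
        }
        return true;
    }

    // Runs the given number of consecutive iterations; returns the failure count.
    static int fuzz(int iterations, int minSize, int maxSize) {
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            if (!runOnce(i, minSize, maxSize)) failures++;
        }
        return failures;
    }

    public static void main(String[] args) {
        // Scaled down: 50 iterations of 300-500 elements each.
        System.out.println("failures: " + fuzz(50, 300, 500));
    }
}
```

In a real test the build and validate steps would go through the serialization and deserialization paths instead of a plain array.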
Tomorrow, I'm planning on testing a 4.5GB individual tree against the new
implementation and doing some profiling to see the exact memory impact now that
it's basically completed on both the serialization and deserialization paths.
Will update with those findings tomorrow!
> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Michael Kjellman
> Priority: Minor
>
> Looking at a heap dump of a 2.0 cluster, I found that the majority of the
> objects are IndexInfo and its ByteBuffers. This is especially bad in endpoints
> with large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for
> GC. Can this be improved by not creating so many objects?