[
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583249#comment-15583249
]
Michael Kjellman commented on CASSANDRA-9754:
---------------------------------------------
Regarding your second point: I'm actually only using the key cache in the
current implementation if a) it's a legacy index that hasn't been upgraded yet
(to keep performance for indexed rows the same during upgrades), or b) the row
is non-indexed, i.e. < 64kb, in which case we only cache the starting offset.
Birch-indexed rows always come from the Birch implementation on disk and don't
get stored in the key cache at all. Ideally I think it would be great if we
could get rid of the key cache altogether! There was some chat about this
earlier in the ticket...
There is the index summary, which has an offset for keys as they are sampled
during compaction; it lets you skip to a given starting file offset inside the
index for a key, which reduces the problem you're talking about.
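Something like the following toy version of that lookup (String keys and an in-memory TreeMap purely for readability; the real summary samples serialized keys/tokens):
{code:java}
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of the index summary lookup: keys sampled during compaction map to the
// file offset of their entry in the -Index component, so a read scans forward from
// the closest preceding sample instead of from the start of the file.
final class ToyIndexSummary
{
    private final NavigableMap<String, Long> samples = new TreeMap<>();

    void addSample(String key, long indexFileOffset)
    {
        samples.put(key, indexFileOffset);
    }

    long startOffsetFor(String key)
    {
        Map.Entry<String, Long> floor = samples.floorEntry(key);
        return floor == null ? 0L : floor.getValue();
    }
}
{code}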
I don't think the performance of the small-to-medium sized case should be any
different with the Birch implementation than with the current one, and I'm
trying to verify that with the workload running against the
test_keyspace.largeuuid1 table.
The issue with the Birch implementation vs. the current one, though, is going
to be the size of the index file on disk, due to the segments being aligned on
4kb boundaries. I've talked a bunch about this and thrown some ideas around
with people, and I think the best option might be to check whether the
previously added row was a non-indexed segment (so just a long for the start of
the partition in the index, with no tree being built) and then skip aligning
the file to a boundary for those cases (see the sketch below). The real issue
is that I don't know the length ahead of time, so I can't just write the
aligned segments at the end, starting from some known offset, and encode
relative offsets iteratively during compaction. Any thoughts on this would be
really appreciated...
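To illustrate the idea, here's a sketch of the proposed alignment rule; this is a thought experiment, not the actual serialization code in the branch:
{code:java}
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of the proposal: pad out to the next 4kb boundary only before writing a
// tree-backed (Birch) segment; non-indexed segments (just the start-of-partition
// offset) are written back to back with no padding.
final class AlignedIndexWriter
{
    private static final int BOUNDARY = 4096;

    static void writeSegment(DataOutputStream out, long partitionOffset, byte[] serializedTree)
            throws IOException
    {
        if (serializedTree == null)
        {
            // non-indexed partition: no tree was built, so don't burn a whole 4kb segment on it
            out.writeLong(partitionOffset);
            return;
        }

        // tree-backed partition: align so the tree's pages start on a 4kb boundary
        int padding = (BOUNDARY - (out.size() % BOUNDARY)) % BOUNDARY;
        out.write(new byte[padding]);
        out.write(serializedTree);
    }
}
{code}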
> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Michael Kjellman
> Priority: Minor
> Fix For: 4.x
>
> Attachments: gc_collection_times_with_birch.png,
> gc_collection_times_without_birch.png, gc_counts_with_birch.png,
> gc_counts_without_birch.png,
> perf_cluster_1_with_birch_read_latency_and_counts.png,
> perf_cluster_1_with_birch_write_latency_and_counts.png,
> perf_cluster_2_with_birch_read_latency_and_counts.png,
> perf_cluster_2_with_birch_write_latency_and_counts.png,
> perf_cluster_3_without_birch_read_latency_and_counts.png,
> perf_cluster_3_without_birch_write_latency_and_counts.png
>
>
> Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects
> are IndexInfo and its ByteBuffers. This is especially bad on endpoints with
> large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K
> IndexInfo objects and 200K ByteBuffers. This creates a lot of churn for
> GC. Can this be improved by not creating so many objects?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)