[
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15538947#comment-15538947
]
Michael Kjellman commented on CASSANDRA-9754:
---------------------------------------------
Wanted to post a quick update on the ticket. I've been working pretty much
around the clock for the last two weeks on stabilizing, performance testing,
validating, and bug fixing the code. Unfortunately, there was an unexpected
death in my family last week, so I lost the better part of this past week; I
had been finishing up the last few pieces when I got the bad news.
After attempting to work with a few people in the community to get
cassandra-stress to actually stress large partitions and validate the data
written into them, I ended up needing to write my own stress tool. I loaded up
a few hundred 30GB+ partitions with column sizes of 300-2048 bytes while
constantly reading back data that was sampled during the inserts, to make sure
reads never return bad data or incorrect results.
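To give a rough idea of what the validation side of the tool does, here's a
minimal sketch of the sampled-read approach. This is not the actual tool, just
the shape of it, assuming the DataStax Java driver and a hypothetical
stress.wide_rows table (pk text, ck bigint, value blob):

{code:java}
import com.datastax.driver.core.*;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

public class SampledValidationStress
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("stress");

        PreparedStatement insert = session.prepare(
                "INSERT INTO wide_rows (pk, ck, value) VALUES (?, ?, ?)");
        PreparedStatement select = session.prepare(
                "SELECT value FROM wide_rows WHERE pk = ? AND ck = ?");

        // keep a small sample of what was written so reads can be validated later
        Map<Long, ByteBuffer> sampled = new HashMap<>();
        ThreadLocalRandom random = ThreadLocalRandom.current();

        for (long ck = 0; ck < 10_000_000L; ck++)
        {
            // random column size in the 300-2048 byte range used for the load
            byte[] payload = new byte[random.nextInt(300, 2049)];
            random.nextBytes(payload);
            ByteBuffer value = ByteBuffer.wrap(payload);
            session.execute(insert.bind("partition-0", ck, value));

            if (random.nextInt(1000) == 0)
                sampled.put(ck, value.duplicate());

            // periodically read a previously sampled column back and compare
            if (ck % 10_000 == 0 && !sampled.isEmpty())
            {
                Long sampledCk = sampled.keySet().iterator().next();
                Row row = session.execute(select.bind("partition-0", sampledCk)).one();
                if (row == null || !row.getBytes("value").equals(sampled.get(sampledCk)))
                    throw new AssertionError("validation failed for ck " + sampledCk);
            }
        }
        cluster.close();
    }
}
{code}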
I ran the most recent load for ~2 days in a small performance cluster and there
were no validation errors. Additionally, I'm running the exact same stress/perf
load in another identical cluster with a 2.1 build that does *not* contain
Birch. This is allowing me to make objective A/B comparisons between the two
builds.
The build is stable, there are no exceptions or errors in the logs even under
pretty high load (the instances are doing 3x the load we generally run at in
production), and most importantly GC is *very* stable. Without Birch, GC starts
off great, but around the time the large partitions generated by the stress
tool reached ~250MB, GC pauses shot up and then kept climbing as the rows grew
(as expected). In contrast, the cluster with the Birch build showed no change
in GC as the size of the partitions increased.
I was a bit disappointed with some of the read latencies I saw in the upper
percentiles, so I identified what I'm almost positive was the cause and just
finished refactoring the logic for serializing/deserializing the aligned
segments and subsegments in PageAlignedWriter/PageAlignedReader.
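For anyone following along, the gist of the alignment logic (with made-up
names, not the real PageAlignedWriter API) is just: write each serialized
segment, then pad out to the next page boundary so segments and subsegments can
always be read starting from a page-aligned offset. A rough illustration:

{code:java}
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative only -- not the actual PageAlignedWriter implementation.
public class AlignedSegmentWriter
{
    private static final int PAGE_SIZE = 4096;

    private final DataOutputStream out;
    private long position = 0;

    public AlignedSegmentWriter(DataOutputStream out)
    {
        this.out = out;
    }

    // writes one segment's bytes, then zero-pads so the *next* segment
    // starts exactly on a page boundary; returns this segment's start offset
    public long writeSegment(byte[] segment) throws IOException
    {
        long segmentStart = position;

        out.write(segment);
        position += segment.length;

        int padding = (int) ((PAGE_SIZE - (position % PAGE_SIZE)) % PAGE_SIZE);
        for (int i = 0; i < padding; i++)
            out.writeByte(0);
        position += padding;

        return segmentStart; // the reader seeks straight to this aligned offset
    }
}
{code}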
I'm cleaning up that commit now and will then get it into the perf cluster to
start another load. If that looks good, I'm hoping to push all the stability
and performance changes I've made up to my public Github branch, most likely
Tuesday, as I'd like to let the performance load run for 2 days to build up
partitions large enough to accurately stress and test things. :)
> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Michael Kjellman
> Priority: Minor
> Fix For: 4.x
>
> Attachments: 9754_part1-v1.diff, 9754_part2-v1.diff
>
>
> Looking at a heap dump of a 2.0 cluster, I found that the majority of the
> objects are IndexInfo and its ByteBuffers. This is especially bad in endpoints
> with large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for
> GC. Can this be improved by not creating so many objects?
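For context on where those numbers come from, here's the rough arithmetic,
assuming the default column_index_size_in_kb of 64 (one IndexInfo entry per
64KB of partition data, each holding a firstName and lastName ByteBuffer):

{code:java}
long partitionBytes  = 6_400_000_000L;                  // ~6.4GB partition
long indexInterval   = 64L * 1024;                      // 64KB per IndexInfo entry
long indexInfoCount  = partitionBytes / indexInterval;  // ~100K IndexInfo objects
long byteBufferCount = indexInfoCount * 2;              // firstName + lastName => ~200K ByteBuffers
{code}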