[
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15538960#comment-15538960
]
Michael Kjellman commented on CASSANDRA-9754:
---------------------------------------------
Oh, another very important update. Originally, I was mmapping 4kb aligned
chunks as necessary. When I finally got things stable (after fixing a few file
descriptor leaks and some fun fighting Java with MappedByteBuffer objects), I ran the
performance load from the stress tool I wrote and found the performance was
randomly *terrible* (like 1.3 SECONDS in the 99.9th percentile). Upon
investigation and a ton of instrumentation I found mmap calls were taking *90+ms*
in the 99th percentile and *70+ms* in the 90th percentile on the hardware I'm
using for performance testing. I looked into the JDK source code to figure out
if there were any synchronized blocks in the native code but it's pretty sane
and just calls the mmap syscall. Discussed it a bit with Norman Maurer and we
both came away pretty shocked that mmap could be that slow! These boxes have
256GB of RAM and there was basically zero disk IO as everything was in the page
cache as expected. There were a lot of major page faults, but it's still very
surprising that mmap can be so horrible in the upper percentiles.
I ripped out all the mmap logic on the read path and switched to reading the
aligned 4kb chunks directly from the RAF (RandomAccessFile) as needed, and
everything looked amazing.
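
To make the two read paths concrete, here is a minimal sketch of both. It is
not the code from the attached patches: the class name AlignedChunkReader, the
method names, and the CHUNK_SIZE constant are all hypothetical and only
illustrate the idea. The first method maps one 4kb aligned chunk per call (one
mmap syscall each time); the second seeks and reads the same chunk straight
from the RandomAccessFile.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, for illustration only (not taken from the patch).
final class AlignedChunkReader {
    // Assumed chunk size: the 4kb alignment mentioned in the comment.
    static final int CHUNK_SIZE = 4096;

    // Round an arbitrary file offset down to the start of its 4kb chunk.
    static long alignDown(long offset) {
        return offset & ~((long) CHUNK_SIZE - 1);
    }

    // mmap approach: every call maps one chunk, i.e. one mmap syscall.
    // The JDK offers no supported way to unmap eagerly; the mapping is
    // released only once the buffer is garbage collected.
    static ByteBuffer readChunkMmap(FileChannel channel, long offset) throws IOException {
        long start = alignDown(offset);
        // The last chunk of the file may be shorter than CHUNK_SIZE.
        long length = Math.min((long) CHUNK_SIZE, channel.size() - start);
        return channel.map(FileChannel.MapMode.READ_ONLY, start, length);
    }

    // Direct approach: seek to the chunk start and copy it into a heap buffer.
    static ByteBuffer readChunkDirect(RandomAccessFile raf, long offset) throws IOException {
        long start = alignDown(offset);
        int length = (int) Math.min((long) CHUNK_SIZE, raf.length() - start);
        byte[] chunk = new byte[length];
        raf.seek(start);
        raf.readFully(chunk);
        return ByteBuffer.wrap(chunk);
    }
}
```

Both methods return a buffer positioned at the start of the chunk containing
the requested offset, so a caller could swap one for the other and time them
under the same load. The direct read costs a copy per chunk but avoids
creating and tearing down a mapping for every 4kb region.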
> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Michael Kjellman
> Priority: Minor
> Fix For: 4.x
>
> Attachments: 9754_part1-v1.diff, 9754_part2-v1.diff
>
>
> Looking at a heap dump of a 2.0 cluster, I found that the majority of the
> objects are IndexInfo and its ByteBuffers. This is especially bad in endpoints
> with large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for
> GC. Can this be improved by not creating so many objects?