[ 
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15538960#comment-15538960
 ] 

Michael Kjellman commented on CASSANDRA-9754:
---------------------------------------------

Oh, another very important update. Originally, I was mmapping 4kb-aligned 
chunks as necessary. Once I finally got things stable, after fixing a few file 
descriptor leaks and some fun fighting Java over MappedByteBuffer objects, I ran 
the performance load from the stress tool I wrote and found the performance was 
randomly *terrible* (like 1.3 SECONDS in the 99.9th percentile). After some 
investigation and a ton of instrumentation, I found mmap calls were taking 
*90+ms* in the 99th percentile and *70+ms* in the 90th percentile on the 
hardware I'm using for performance testing. I looked into the JDK source code to 
see whether there were any synchronized blocks in the native code, but it's 
pretty sane and just calls the mmap syscall. I discussed it a bit with Norman 
Maurer and we were both pretty shocked that mmap could be that slow! These boxes 
have 256GB of RAM and there was basically zero disk IO, as everything was in the 
page cache as expected. There were a lot of major page faults, but it's really 
very surprising that mmap can be so horrible in the upper percentiles.
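
For anyone who wants to check this on their own hardware, a rough sketch of the 
kind of probe involved might look like the following. It is not the 
instrumentation used for the numbers above: the class name, the sample count, 
and the nearest-rank percentile method are all assumptions made for 
illustration. It times FileChannel.map() on random 4kb-aligned offsets and 
prints the upper percentiles.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical probe, not part of the Cassandra code base.
public class MmapLatencyProbe {
    static final int CHUNK = 4096; // 4kb chunks, as described above

    // Nearest-rank percentile over an ascending-sorted array.
    static long percentile(long[] sorted, double pct) {
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, Math.min(sorted.length - 1, rank - 1))];
    }

    // Maps `samples` random 4kb-aligned chunks and returns the sorted latencies in ns.
    static long[] timeMapCalls(Path file, int samples) throws IOException {
        long[] nanos = new long[samples];
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long chunks = ch.size() / CHUNK;
            if (chunks == 0) {
                throw new IllegalArgumentException("file is smaller than one chunk");
            }
            for (int i = 0; i < samples; i++) {
                long offset = ThreadLocalRandom.current().nextLong(chunks) * CHUNK;
                long start = System.nanoTime();
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, offset, CHUNK);
                buf.get(0); // touch the page so the first fault is part of the sample
                nanos[i] = System.nanoTime() - start;
            }
        }
        Arrays.sort(nanos);
        return nanos;
    }

    public static void main(String[] args) throws IOException {
        long[] sorted = timeMapCalls(Paths.get(args[0]), 10_000);
        System.out.printf("p90=%dus p99=%dus p99.9=%dus%n",
                percentile(sorted, 90) / 1_000,
                percentile(sorted, 99) / 1_000,
                percentile(sorted, 99.9) / 1_000);
    }
}
```

Run it against a large file that is already in the page cache. Note that the 
mappings are only released when the buffers are garbage collected, so a long 
run of this probe will itself hold on to a lot of mappings.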

I ripped out all the mmap logic on the read path and switched to reading the 
aligned 4kb chunks directly from the RAF (RandomAccessFile) as needed, and 
everything looked amazing.
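
A minimal sketch of that read path, assuming a plain java.io.RandomAccessFile 
and a fixed 4kb chunk size, could look like this. The class and method names 
are hypothetical and this is not the code from the attached patches. It rounds 
the requested position down to a chunk boundary and reads that chunk with 
seek() and readFully().

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not the code from the attached patches.
public class AlignedChunkReader {
    static final int CHUNK = 4096; // must be a power of two for the mask below

    // Rounds a file position down to the start of its 4kb chunk.
    static long alignDown(long position) {
        return position & ~(long) (CHUNK - 1);
    }

    // Reads the aligned chunk containing `position`; the last chunk may be short.
    static byte[] readChunk(RandomAccessFile raf, long position) throws IOException {
        if (position < 0 || position >= raf.length()) {
            throw new IllegalArgumentException("position outside file: " + position);
        }
        long start = alignDown(position);
        int len = (int) Math.min(CHUNK, raf.length() - start);
        byte[] chunk = new byte[len];
        raf.seek(start);
        raf.readFully(chunk);
        return chunk;
    }
}
```

With everything already in the page cache, each call is a seek plus a 4kb copy 
into a heap array, with no new mapping created per chunk.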

> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
>                 Key: CASSANDRA-9754
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Michael Kjellman
>            Priority: Minor
>             Fix For: 4.x
>
>         Attachments: 9754_part1-v1.diff, 9754_part2-v1.diff
>
>
> Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects 
> are IndexInfo and its ByteBuffers. This is especially bad in endpoints with 
> large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K 
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for 
> GC. Can this be improved by not creating so many objects?
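
The object counts in the description can be roughly reproduced with some 
back-of-the-envelope arithmetic. The sketch below assumes one IndexInfo per 
64kb of partition data (the default column_index_size_in_kb) and two 
ByteBuffers per IndexInfo (first and last name). Both figures are assumptions 
here, not something stated in the ticket, and the class name is hypothetical.

```java
// Hypothetical estimate, for illustration only.
public class IndexInfoEstimate {
    // Number of index entries for a partition: one per index block, rounded up.
    static long indexEntries(long partitionBytes, long indexBlockBytes) {
        return (partitionBytes + indexBlockBytes - 1) / indexBlockBytes;
    }

    public static void main(String[] args) {
        long partitionBytes = 6_400_000_000L; // 6.4GB partition
        long indexBlockBytes = 64L * 1024;    // assumed 64kb index granularity
        long entries = indexEntries(partitionBytes, indexBlockBytes);
        long byteBuffers = entries * 2;       // assumed two ByteBuffers per entry
        System.out.println("IndexInfo objects: " + entries);
        System.out.println("ByteBuffers:       " + byteBuffers);
    }
}
```

Under those assumptions the estimate lands just under 100K IndexInfo objects 
and 200K ByteBuffers, in line with the figures quoted above.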



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
