[jira] [Commented] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

Ariel Weisberg (JIRA) Tue, 18 Aug 2015 10:19:40 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701626#comment-14701626
 ]


Ariel Weisberg commented on CASSANDRA-8630:
-------------------------------------------

OK, I won't get too involved in trying to benchmark it then. I think we have 
demonstrated it isn't a regression with Stefania's changes to the rate limiting.

Stefania, a handy view in flight recorder is Threads -> Latencies. Under Java 
Thread Sleep you can see 882 milliseconds spent sleeping across 98 instances. 
That is for the uncompressed case with 8630; the sleep time isn't present on 
trunk. If there is a single thread (or piece of code) that you want to be hot, 
you can check the Latencies view for time spent parked, waiting, sleeping, or 
in IO. It's not the greatest view because it doesn't group by thread.

Since flight recorder mostly accounts only for CPU time, Latencies is the view 
to use to find threads that are blocked on something other than contention or 
IO. Flight recorder also doesn't account for time spent faulting memory-mapped 
files :-) In retrospect, this would be the "flight recorder" way to make the 
discovery you made in VisualVM.
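
Since the Latencies view doesn't group by thread, here is a rough sketch of 
doing that grouping outside the UI. It assumes a recording made on JDK 11 or 
later, where the jdk.jfr.consumer API and the "jdk.ThreadSleep" event name are 
available; it will not read recordings from the older JDK 7/8 flight recorder. 
The class name SleepByThread is made up for illustration. Short sleeps may also 
be filtered out by the event's threshold setting in the recording template.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.util.Map;
import java.util.TreeMap;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedThread;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper, not part of Cassandra or of any patch on this ticket.
public class SleepByThread {

    /** Sums the duration of jdk.ThreadSleep events per Java thread name. */
    public static Map<String, Duration> sleepByThread(Path jfrFile) throws IOException {
        Map<String, Duration> totals = new TreeMap<>();
        for (RecordedEvent event : RecordingFile.readAllEvents(jfrFile)) {
            if (!"jdk.ThreadSleep".equals(event.getEventType().getName())) {
                continue; // only sleep events are of interest here
            }
            RecordedThread thread = event.getThread();
            String name = (thread == null || thread.getJavaName() == null)
                    ? "<unknown>"
                    : thread.getJavaName();
            totals.merge(name, event.getDuration(), Duration::plus);
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a .jfr recording file
        sleepByThread(Paths.get(args[0])).forEach((name, total) ->
                System.out.println(name + ": " + total.toMillis() + " ms sleeping"));
    }
}
```

The same loop could be pointed at other event names to get per-thread totals 
for parking or waiting.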

> Faster sequential IO (on compaction, streaming, etc)
> ----------------------------------------------------
>
>                 Key: CASSANDRA-8630
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core, Tools
>            Reporter: Oleg Anastasyev
>            Assignee: Stefania
>              Labels: compaction, performance
>             Fix For: 3.x
>
>         Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz, 
> mmaped_uncomp_hotspot.png
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.), a 
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's 
> write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are implemented as numerous 
> byte-by-byte read and write calls. 
> This makes a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods either 
> way gives an 8x speed increase.
> The attached patch implements the RandomAccessReader.read<Type> and 
> SequentialWriter.write<Type> methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much less CPU load for compaction. 
> (I attached a CPU load graph from one of our production nodes; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to 
> compaction.)
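
To illustrate the point in the description (this is not the attached patch), 
here is a minimal Java sketch contrasting a byte-by-byte readLong with a bulk 
read that is decoded in memory. The names BulkLongRead, CountingStream, 
readLongSlow and readLongBulk are made up for the example, and an in-memory 
stream stands in for the file so that the number of calls reaching the 
underlying source can be counted.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical illustration only; not code from Cassandra or the patch.
public class BulkLongRead {

    /** Wraps a byte array and counts calls that reach the underlying source. */
    public static final class CountingStream extends InputStream {
        private final InputStream in;
        public int calls;

        public CountingStream(byte[] data) {
            this.in = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return in.read(b, off, len);
        }
    }

    /** Byte-by-byte: eight single-byte reads, assembled big-endian. */
    public static long readLongSlow(InputStream in) throws IOException {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException();
            }
            value = (value << 8) | b;
        }
        return value;
    }

    /** Bulk: fill an 8-byte array, then decode it in memory. */
    public static long readLongBulk(InputStream in) throws IOException {
        byte[] buf = new byte[8];
        int filled = 0;
        while (filled < buf.length) {
            int n = in.read(buf, filled, buf.length - filled);
            if (n < 0) {
                throw new EOFException();
            }
            filled += n;
        }
        return ByteBuffer.wrap(buf).getLong(); // ByteBuffer is big-endian by default
    }

    public static void main(String[] args) throws IOException {
        byte[] data = ByteBuffer.allocate(8).putLong(0x0102030405060708L).array();
        CountingStream slow = new CountingStream(data);
        CountingStream bulk = new CountingStream(data);
        System.out.println(Long.toHexString(readLongSlow(slow)) + " via " + slow.calls + " calls");
        System.out.println(Long.toHexString(readLongBulk(bulk)) + " via " + bulk.calls + " calls");
    }
}
```

When the source is an unbuffered file rather than a byte array, each of those 
calls can become a syscall, which is the cost the description refers to.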



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
