[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701566#comment-14701566
 ] 

Benedict edited comment on CASSANDRA-8630 at 8/18/15 4:58 PM:
--------------------------------------------------------------

I guess it depends on what we're trying to demonstrate. This ticket was meant 
to be an LHF, to fix some obvious problems with RAR:

* If we're just trying to establish how much faster "sequential IO" is (as the 
ticket is targeting) we probably just want to see how quickly we can read the 
contents of an sstable from start to finish. 
** It's worth noting that this may be more impactful on 2.2 or below, as we 
made a great deal more use of readInt(), which has a terribly inefficient 
implementation. 3.0 and trunk now use readUnsignedVInt a great deal more, and 
this is considerably more efficient.
* If we're trying to establish how much faster compaction gets as a result, we 
probably want to test between 4 and 10 files, since the former is what we'll 
compact with STCS, and the latter with LCS, AFAIK.
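To illustrate the readInt() point above: the stock DataInput contract assembles an int from four single-byte reads, each of which can bottom out in a syscall on an unbuffered stream, whereas a buffered reader decodes the value from memory in one step. A minimal sketch of the two shapes (illustrative only, not Cassandra's actual code):

```java
import java.nio.ByteBuffer;

public class ReadIntSketch {
    // Byte-at-a-time assembly, mirroring what the default DataInput.readInt()
    // implementation does: four separate reads, each potentially a syscall
    // when the underlying stream is unbuffered.
    static int readIntByteByByte(ByteBuffer in) {
        int b1 = in.get() & 0xFF;
        int b2 = in.get() & 0xFF;
        int b3 = in.get() & 0xFF;
        int b4 = in.get() & 0xFF;
        return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
    }

    // Bulk decode against an in-memory buffer: one bounds check, no syscalls.
    static int readIntBuffered(ByteBuffer in) {
        return in.getInt();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putInt(0xCAFEBABE).putInt(0xCAFEBABE).flip();
        // Both paths decode the same big-endian int.
        System.out.println(readIntByteByByte(buf) == readIntBuffered(buf));
    }
}
```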

Personally I'm only interested in quickly confirming this ticket improves the 
basic properties it's aiming for. A wider-scope analysis of compaction 
performance is very much necessary, but can wait until after 3.0 ships.

edit: Another option is a microbenchmark comparing the performance of a sequence 
of many reads of each basic data type. However, the main approach of this patch 
is simply to exploit the classes we've already spent time optimising, so I kind 
of think most of these corroborations are not super necessary, beyond broad 
strokes. At minimum, this patch is part of the general steady wave of cleansing 
we're applying to a messy and fractured bit of the codebase.
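For reference, the kind of microbenchmark mentioned here could look like the following: a crude nanoTime loop rather than a proper JMH harness, comparing byte-at-a-time int assembly against bulk decoding from a buffer. The names and iteration count are illustrative assumptions, not taken from the patch:

```java
import java.nio.ByteBuffer;

public class SequentialReadBench {
    static final int N = 1_000_000; // ints per pass (illustrative)

    // Assemble each int from four single-byte reads, as the default
    // DataInput.readInt() implementation does.
    static long sumByteByByte(ByteBuffer in) {
        long sum = 0;
        while (in.remaining() >= 4) {
            int v = (in.get() & 0xFF) << 24 | (in.get() & 0xFF) << 16
                  | (in.get() & 0xFF) << 8  | (in.get() & 0xFF);
            sum += v;
        }
        return sum;
    }

    // Decode each int in one bulk operation against the buffer.
    static long sumBulk(ByteBuffer in) {
        long sum = 0;
        while (in.remaining() >= 4) sum += in.getInt();
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(N * 4);
        for (int i = 0; i < N; i++) buf.putInt(i);

        // Warm up once and verify both paths agree before timing.
        buf.flip();
        long a = sumByteByByte(buf);
        buf.flip();
        long b = sumBulk(buf);
        if (a != b) throw new AssertionError();

        buf.flip();
        long t0 = System.nanoTime();
        sumByteByByte(buf);
        long t1 = System.nanoTime();
        buf.flip();
        sumBulk(buf);
        long t2 = System.nanoTime();
        System.out.printf("byte-by-byte: %d us, bulk: %d us%n",
                          (t1 - t0) / 1000, (t2 - t1) / 1000);
    }
}
```

A real comparison would want JMH (or similar) to handle warmup and dead-code elimination properly; this only shows the rough shape.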



> Faster sequential IO (on compaction, streaming, etc)
> ----------------------------------------------------
>
>                 Key: CASSANDRA-8630
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core, Tools
>            Reporter: Oleg Anastasyev
>            Assignee: Stefania
>              Labels: compaction, performance
>             Fix For: 3.x
>
>         Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz, 
> mmaped_uncomp_hotspot.png
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.), a 
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's 
> write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are implemented as numerous byte-by-byte 
> reads and writes, which also means a lot of syscalls.
> A quick microbenchmark shows that just reimplementing these methods gives an 
> 8x speed increase.
> The attached patch implements the RandomAccessReader.read<Type> and 
> SequentialWriter.write<Type> methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% faster 
> on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much lower CPU load for compaction.
> (I attached a CPU load graph from one of our production clusters; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to 
> compaction.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
