[
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702681#comment-14702681
]
Benedict edited comment on CASSANDRA-8630 at 8/19/15 8:10 AM:
--------------------------------------------------------------
bq. RateLimiter is not a final class.
I think Ariel was suggesting a new class that explicitly performs no work.
However, since we use this class more often for reads than we do for
compaction, I would prefer we stick with the more performant option of just
null checking. Certainly using a full-fat RateLimiter is more expensive than
this.
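To make the trade-off concrete, here is a minimal sketch of the two alternatives being discussed. This is illustrative only, with a stand-in `Limiter` class rather than Guava's actual `RateLimiter`; none of these names come from the Cassandra codebase.

```java
// Sketch of the two options: a no-op limiter subclass vs. a null check
// on the hot path. All names here are illustrative stand-ins.
public class LimiterSketch {
    // Stand-in for a real rate limiter (e.g. Guava's RateLimiter)
    static class Limiter {
        private final double permitsPerSec;
        Limiter(double permitsPerSec) { this.permitsPerSec = permitsPerSec; }
        double acquire() { return 1.0 / permitsPerSec; } // pretend wait time
    }

    // Option A: an explicit no-op subclass -- every call still pays a
    // (virtual) method dispatch on the hot read path
    static class NoOpLimiter extends Limiter {
        NoOpLimiter() { super(Double.MAX_VALUE); }
        @Override double acquire() { return 0.0; }
    }

    // Option B: a nullable field with a null check -- when unlimited, the
    // branch is trivially predicted and no call is made at all
    static double readWithLimit(Limiter limiter) {
        return limiter == null ? 0.0 : limiter.acquire();
    }

    public static void main(String[] args) {
        System.out.println(readWithLimit(null));               // unlimited path
        System.out.println(readWithLimit(new Limiter(100.0))); // throttled path
    }
}
```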
bq. Have a look at MmappedSegmentedFile.Builder.addPotentialBoundary() and
createSegments()
I should have written a bit about this before work started: my expectation is
that this can all be completely removed. The reason for it was that we treated
each mmap file segment as completely distinct, so we had to have each partition
end before a 2G boundary (so we could map the segment in its entirety), or else
fall back to a non-mmap segment. That's no longer the case: since we just
rebuffer, we can safely eliminate all of the mess with segment boundaries and
simply map in increments of 2G (or, frankly, whatever size we like; it might be
nice to do it exactly once when we "early open" so that we do not remap the
same regions multiple times). At the same time we can eliminate the idea of
multiple segments; we should always have just one. Given this, we should also
consider renaming them, since they're no longer "segments" - they cover the
whole file.
(Caveat: I haven't reviewed the code directly, I'm just going off the comments)
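A sketch of what "just map in increments of 2G" could look like, using plain `java.nio` rather than anything from the patch; the class and method names are hypothetical. Each `FileChannel.map()` call is limited to `Integer.MAX_VALUE` bytes, which is where the ~2G increment comes from.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Illustrative only (not the Cassandra patch): map an entire file as a
// sequence of fixed-size regions, with no per-partition boundary logic.
public class ChunkedMapper {
    // A single mapping cannot exceed Integer.MAX_VALUE bytes (~2G)
    static final long CHUNK = Integer.MAX_VALUE;

    static List<MappedByteBuffer> mapWhole(Path file) throws IOException {
        List<MappedByteBuffer> regions = new ArrayList<>();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += CHUNK) {
                long len = Math.min(CHUNK, size - pos);
                regions.add(ch.map(FileChannel.MapMode.READ_ONLY, pos, len));
            }
        }
        return regions; // mappings remain valid after the channel is closed
    }
}
```

Doing this once, up front, also makes the "map exactly once at early open" idea straightforward: the region list is computed from the file length alone, so remapping the same ranges is never necessary.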
> Faster sequential IO (on compaction, streaming, etc)
> ----------------------------------------------------
>
> Key: CASSANDRA-8630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
> Project: Cassandra
> Issue Type: Improvement
> Components: Core, Tools
> Reporter: Oleg Anastasyev
> Assignee: Stefania
> Labels: compaction, performance
> Fix For: 3.x
>
> Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png,
> flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz,
> mmaped_uncomp_hotspot.png
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc), a
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's
> write(int). This is because the default implementations of readShort,
> readLong, etc, as well as their matching write* methods, are implemented as
> numerous byte-by-byte reads and writes.
> This also makes a lot of syscalls.
> A quick microbenchmark shows that just reimplementing these methods in
> either direction gives an 8x speed increase.
> The attached patch implements the RandomAccessReader.read<Type> and
> SequentialWriter.write<Type> methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30%
> faster on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much less CPU load for compaction.
> (I attached a CPU load graph from one of our production clusters; orange is
> niced CPU load, i.e. compaction; yellow is user, i.e. non-compaction tasks.)
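The byte-by-byte decoding described in the issue can be sketched as follows. This is illustrative of the problem, not code from the patch: `DataInputStream.readLong()` issues eight single-byte reads against the underlying stream, whereas a buffered reimplementation assembles the value directly from an in-memory buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Illustrative sketch (names are hypothetical, not from the patch):
// decode a big-endian long from a byte[] in one pass, instead of the
// eight single-byte read() calls DataInputStream.readLong() performs.
public class BulkReadSketch {
    static long readLong(byte[] buf, int off) {
        long v = 0;
        for (int i = 0; i < 8; i++)
            v = (v << 8) | (buf[off + i] & 0xFFL); // big-endian assembly
        return v;
    }

    public static void main(String[] args) throws IOException {
        byte[] buf = {0, 0, 0, 0, 0, 0, 0x30, 0x39}; // big-endian 12345
        // Default path: eight read() calls under the hood
        long viaStream =
            new DataInputStream(new ByteArrayInputStream(buf)).readLong();
        // Buffered path: one array traversal, same value
        System.out.println(viaStream == readLong(buf, 0));
    }
}
```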
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)