[ https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703597#comment-14703597 ]

Ariel Weisberg edited comment on CASSANDRA-8630 at 8/19/15 10:52 PM:
---------------------------------------------------------------------

Yes, you should rebase to 3.0. We port changes forward (I learned this recently
myself).

* MemoryInputStream.available() can overflow when it adds buffer.remaining() to
Ints.saturatedCast(memRemaining()). Do the addition in long arithmetic and then
apply the saturating cast (see the first sketch after this list).
* Why does RandomAccessReader accept both a builder and a parameter for
initializing the buffer? It seems like we lose the benefit of a builder
allowing a constant signature.
* A nit in initializeBuffer: it does firstSegment.value().slice(), which
implies you want a subset of the buffer. duplicate() makes it obvious there is
no such concern (second sketch below).
* I think there is a place for unit tests stressing the 2-gigabyte boundary.
That means testing available()/length()/remaining() style methods as well as
being able to read and seek with instances of these things that are larger
than 2 GB. Doing it with the actual file-based ones seems bad, but maybe you
could intercept those to work with memory so they run fast, or ingloriously
mock their inputs.
* For rate limiting, is your current solution to consume buffer-size bytes
from the limiter at a time for both mmap and standard reads? And do you
accomplish this by slicing the buffer and then updating the position? I don't
see you setting the limit before slicing (third sketch below).
* I thought NIODataInputStream had methods for reading into ByteBuffers, but I
was wrong. It's kind of thorny to add one to RebufferingInputStream, so I think
you did the right thing putting it in FileSegmentedInputStream even though it's
an odd concern to have in that class. Unless you have a better idea.
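
A minimal sketch of the overflow point above, assuming only that memRemaining()
returns a long that can exceed Integer.MAX_VALUE (the field names here are
illustrative, not the patch's):

{code:java}
import java.nio.ByteBuffer;
import com.google.common.primitives.Ints;

class AvailableSketch
{
    private final ByteBuffer buffer;  // the chunk currently exposed to readers
    private final long memRemaining;  // bytes left in the backing Memory, may exceed 2 GiB

    AvailableSketch(ByteBuffer buffer, long memRemaining)
    {
        this.buffer = buffer;
        this.memRemaining = memRemaining;
    }

    // Problem: the int addition can wrap before the saturating cast ever runs.
    int availableCanOverflow()
    {
        return buffer.remaining() + Ints.saturatedCast(memRemaining);
    }

    // Suggested: do the addition in long arithmetic, then one saturating cast.
    int availableSaturated()
    {
        return Ints.saturatedCast((long) buffer.remaining() + memRemaining);
    }
}
{code}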
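
On the slice() vs duplicate() nit, the difference in a nutshell (plain JDK
ByteBuffer behaviour, nothing patch-specific):

{code:java}
import java.nio.ByteBuffer;

class SliceVsDuplicate
{
    public static void main(String[] args)
    {
        ByteBuffer segment = ByteBuffer.allocate(16);
        segment.position(4);

        // slice(): a view of only the *remaining* bytes, rebased to position 0.
        // It reads as "I want a subset of this buffer".
        ByteBuffer sliced = segment.slice();         // capacity 12, position 0

        // duplicate(): same content and same position/limit/capacity, just an
        // independent cursor. It reads as "same buffer, independent state".
        ByteBuffer duplicated = segment.duplicate(); // capacity 16, position 4

        System.out.println(sliced.capacity() + " vs " + duplicated.capacity()); // 12 vs 16
    }
}
{code}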
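
And for the rate-limiting question, the pattern I would expect, sketched with
Guava's RateLimiter (the method and parameter names are mine, not the patch's).
The point is that the limit has to be set before slicing, or the slice spans to
the end of the mapped buffer instead of just the rate-limited chunk:

{code:java}
import java.nio.ByteBuffer;
import com.google.common.util.concurrent.RateLimiter;

class RateLimitedChunks
{
    // Hand out the source buffer one bounded chunk at a time, charging that many
    // bytes against the limiter per chunk. The caller is assumed to have checked
    // source.hasRemaining() and to pass a positive chunkSize.
    static ByteBuffer nextChunk(ByteBuffer source, RateLimiter limiter, int chunkSize)
    {
        int length = Math.min(chunkSize, source.remaining());
        limiter.acquire(length);                     // consume "buffer size" bytes of budget

        int oldLimit = source.limit();
        source.limit(source.position() + length);    // bound the view to this chunk...
        ByteBuffer chunk = source.slice();           // ...otherwise slice() runs to oldLimit
        source.limit(oldLimit);
        source.position(source.position() + length); // advance past the chunk handed out
        return chunk;
    }
}
{code}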

Stefania, is your rework of segment handling still in progress? IOW, should I
hold off until you are done?

[~benedict] In what scenario would we not want to map the file with as few 2 
gigabyte buffers as possible?

-I am still digesting the segments/boundaries/mapping issues.-
Sort of digested it.

* In SSTableReader you are adding and removing fields from files. What are the
cross-version compatibility issues with that?
* There might be something to not remapping entire files every 50 megabytes as 
part of early opening, but it's definitely better as a separate task. It's also 
not clear whether it's going to be faster or just feel better.
* I don't see how an array of pairs can be less indirection than a map, or
result in less boxing, unless there are parallel arrays: one with primitive
offsets and another with the ByteBuffers (see the sketch after this list).
* ImmutableSortedMap (or is it navigable?) might split the difference between 
the two approaches.
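
To make the parallel-arrays idea above concrete (a sketch with made-up names;
an ImmutableSortedMap with a floorEntry() lookup would answer the same question
with a boxed Long key):

{code:java}
import java.nio.ByteBuffer;
import java.util.Arrays;

class SegmentIndex
{
    private final long[] offsets;        // sorted start offset of each mapped segment
    private final ByteBuffer[] segments; // segments[i] covers [offsets[i], offsets[i+1])

    SegmentIndex(long[] offsets, ByteBuffer[] segments)
    {
        this.offsets = offsets;
        this.segments = segments;
    }

    // Floor lookup with no boxing and no Map.Entry indirection. Assumes
    // offsets[0] == 0 and that position falls within the file.
    ByteBuffer segmentFor(long position)
    {
        int i = Arrays.binarySearch(offsets, position);
        if (i < 0)
            i = -i - 2;  // insertion point minus one, i.e. the floor entry
        return segments[i];
    }
}
{code}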






> Faster sequential IO (on compaction, streaming, etc)
> ----------------------------------------------------
>
>                 Key: CASSANDRA-8630
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core, Tools
>            Reporter: Oleg Anastasyev
>            Assignee: Stefania
>              Labels: compaction, performance
>             Fix For: 3.x
>
>         Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz, 
> mmaped_uncomp_hotspot.png
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.) a
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's write(int).
> This is because the default implementations of readShort, readLong, etc., as
> well as their matching write* methods, are built from numerous byte-by-byte
> reads and writes.
> This makes a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods either
> way gives an 8x speed increase.
> The attached patch implements the RandomAccessReader.read<Type> and
> SequentialWriter.write<Type> methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% faster
> on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much less CPU load for compaction.
> (I attached a CPU load graph from one of our production nodes; orange is
> niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to
> compaction.)
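
For reference, the shape of the difference the description is pointing at (a
minimal sketch of the two read paths, not Cassandra's actual code):

{code:java}
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

class ReadLongSketch
{
    // The slow path the description refers to: RandomAccessFile.readLong() is
    // built from eight single-byte read() calls, each a potential syscall.
    static long readLongByteByByte(RandomAccessFile raf) throws IOException
    {
        long v = 0;
        for (int i = 0; i < 8; i++)
            v = (v << 8) | (raf.read() & 0xFF); // EOF handling omitted for brevity
        return v;
    }

    // The buffered alternative: fill a large ByteBuffer once, then decode
    // multi-byte values straight out of memory.
    static long readLongBuffered(ByteBuffer buffer)
    {
        return buffer.getLong();
    }
}
{code}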


