[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654789#comment-14654789
 ] 

Stefania commented on CASSANDRA-8630:
-------------------------------------

Thanks for the heads up on 9500.

I introduced a helper method that returns an intermediate object for the 
slow path. It currently returns a ByteBuffer, but I can change it to use a 
Long. Sample code is below.

The point is that when we switch buffers there must be no leftover bytes in 
the current buffer; otherwise we can't plug in memory-mapped buffers without 
copying data into another buffer.

{code}
    @Override
    public int readInt() throws IOException
    {
        if (buffer.remaining() >= 4)
            return buffer.getInt();
        else
            return rebufferWithRemaining(4).getInt();
    }

    private ByteBuffer rebufferWithRemaining(int minimum) throws IOException
    {
        assert buffer.remaining() < minimum;
        byte[] b = new byte[minimum];

        // put remaining bytes in b

        reBuffer(); // here the buffer must be entirely consumed

        // add missing bytes to b,
        // throw EOFException if not enough bytes

        return ByteBuffer.wrap(b);
    }
{code}
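
To make the slow path concrete, here is a minimal, self-contained sketch of the 
same idea. It is not the patch itself: the SwappingReader class, the iterator 
that supplies buffers, and the boolean return value of reBuffer are assumptions 
made only so the example can run on its own. It also loops, in case a value 
spans more than two buffers.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for the reader under discussion; not a Cassandra class.
class SwappingReader
{
    // Assumed supply of ready-made buffers, e.g. memory-mapped segments.
    private final Iterator<ByteBuffer> source;
    private ByteBuffer buffer = ByteBuffer.allocate(0);

    SwappingReader(ByteBuffer... chunks)
    {
        this.source = Arrays.asList(chunks).iterator();
    }

    public int readInt() throws IOException
    {
        if (buffer.remaining() >= 4)
            return buffer.getInt();
        return rebufferWithRemaining(4).getInt();
    }

    // Swaps in the next buffer and returns false at end of input (assumed contract).
    // The current buffer must be fully consumed, so nothing needs to be copied over.
    protected boolean reBuffer() throws IOException
    {
        assert !buffer.hasRemaining();
        if (!source.hasNext())
            return false;
        buffer = source.next();
        return true;
    }

    private ByteBuffer rebufferWithRemaining(int minimum) throws IOException
    {
        assert buffer.remaining() < minimum;
        byte[] b = new byte[minimum];

        // Drain whatever is left of the current buffer into b.
        int filled = buffer.remaining();
        buffer.get(b, 0, filled);

        // Keep swapping buffers until b is full.
        while (filled < minimum)
        {
            if (!buffer.hasRemaining() && !reBuffer())
                throw new EOFException("needed " + minimum + " bytes, got " + filled);
            int n = Math.min(minimum - filled, buffer.remaining());
            buffer.get(b, filled, n);
            filled += n;
        }
        return ByteBuffer.wrap(b); // big-endian by default, like a fresh ByteBuffer
    }
}
```

Only the few bytes of a value that straddles a boundary are ever copied; the 
swapped-in buffers themselves are used as they are.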

The method intended to be overridden is {{reBuffer}}:
* The default implementation in NIODataInputStream continues reading from the 
channel without page alignment, as it does at the moment.
* RAR (RandomAccessReader) either reads page-aligned buffers or swaps in 
memory-mapped files.
* MemoryInputStream swaps in hollow byte buffers that wrap native memory.
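
As a rough illustration of that split, the sketch below contrasts a 
channel-backed reBuffer with one that swaps in existing buffers. All class 
names here (RebufferingInput, ChannelInput, SegmentInput) are hypothetical, the 
boolean return value is an assumed contract, and plain heap buffers stand in 
for memory-mapped or native-memory buffers.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical base class: subclasses decide where the next buffer comes from.
abstract class RebufferingInput
{
    protected ByteBuffer buffer = ByteBuffer.allocate(0);

    // Called only when buffer is fully consumed; returns false at end of input.
    protected abstract boolean reBuffer() throws IOException;

    public byte readByte() throws IOException
    {
        while (!buffer.hasRemaining())
            if (!reBuffer())
                throw new EOFException();
        return buffer.get();
    }
}

// Default style: refill one owned buffer from a channel, with no alignment.
class ChannelInput extends RebufferingInput
{
    private final ReadableByteChannel channel;
    private final ByteBuffer owned;

    ChannelInput(ReadableByteChannel channel, int capacity)
    {
        this.channel = channel;
        this.owned = ByteBuffer.allocate(capacity);
    }

    @Override
    protected boolean reBuffer() throws IOException
    {
        owned.clear();
        int n = channel.read(owned); // -1 signals end of stream
        owned.flip();
        buffer = owned;
        return n >= 0;
    }
}

// Swap style: hand out existing buffers without copying their contents.
class SegmentInput extends RebufferingInput
{
    private final ByteBuffer[] segments;
    private int next = 0;

    SegmentInput(ByteBuffer... segments)
    {
        this.segments = segments;
    }

    @Override
    protected boolean reBuffer()
    {
        if (next == segments.length)
            return false;
        buffer = segments[next++].duplicate(); // shared contents, independent position
        return true;
    }
}
```

Because reBuffer is only ever called on an empty buffer, SegmentInput can 
replace the buffer reference outright, which is what makes the swap copy-free.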

Another open question is whether we are happy to keep using the ByteBuffer 
get() methods or should write our own.
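
For reference, "writing our own" would mean something along these lines. This 
is only a sketch of a hand-rolled big-endian decode; the IntDecoding class and 
getIntManually name are made up for illustration, and it makes no claim about 
which variant is faster.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, shown only to illustrate the alternative to ByteBuffer.getInt().
class IntDecoding
{
    // Reads four bytes from the buffer's current position, most significant byte first.
    static int getIntManually(ByteBuffer buf)
    {
        return ((buf.get() & 0xFF) << 24)
             | ((buf.get() & 0xFF) << 16)
             | ((buf.get() & 0xFF) << 8)
             |  (buf.get() & 0xFF);
    }
}
```

On a buffer in the default big-endian order this returns the same value as 
getInt() and advances the position by four bytes.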


> Faster sequential IO (on compaction, streaming, etc)
> ----------------------------------------------------
>
>                 Key: CASSANDRA-8630
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core, Tools
>            Reporter: Oleg Anastasyev
>            Assignee: Stefania
>              Labels: compaction, performance
>             Fix For: 3.x
>
>         Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.) a 
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's 
> write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are implemented with numerous 
> byte-by-byte read and write calls. 
> This makes a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods either 
> way gives an 8x speed increase.
> The attached patch implements the RandomAccessReader.read<Type> and 
> SequentialWriter.write<Type> methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much less CPU load for compaction. 
> (I attached a CPU load graph from one of our production systems; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to 
> compaction.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
