[ 
https://issues.apache.org/jira/browse/RATIS-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated RATIS-1176:
------------------------------
    Description: 
In RATIS-1175, we provided a WritableByteChannel view of DataStreamOutput in 
order to support FileChannel.transferTo.  However, [~runzhiwang] pointed out 
that sun.nio.ch.FileChannelImpl.transferTo has three submethods
- transferToDirectly (fastest)
- transferToTrustedChannel
- transferToArbitraryChannel (slowest, requires buffer copying)

Unfortunately, our current implementation is only able to use 
transferToArbitraryChannel.
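
To make the problem concrete, here is a minimal sketch of FileChannel.transferTo writing into a user-defined WritableByteChannel. CountingChannel, transferAll and demo are hypothetical names used only for illustration; CountingChannel stands in for the channel view of DataStreamOutput and is not the Ratis class. A plain user-defined channel like this one is expected to take the slowest of the three paths listed above, since the JDK has to hand the data over through an intermediate buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TransferToSketch {
  /** Hypothetical stand-in for the WritableByteChannel view of DataStreamOutput. */
  static final class CountingChannel implements WritableByteChannel {
    private long bytes;
    private boolean open = true;

    @Override
    public int write(ByteBuffer src) {
      final int n = src.remaining();
      src.position(src.limit()); // consume the buffer, as a sink would after sending it
      bytes += n;
      return n;
    }

    @Override
    public boolean isOpen() {
      return open;
    }

    @Override
    public void close() {
      open = false;
    }

    long getBytes() {
      return bytes;
    }
  }

  /** Loops because a single transferTo call may move fewer bytes than requested. */
  static long transferAll(Path file, WritableByteChannel out) throws IOException {
    try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
      final long size = in.size();
      long position = 0;
      while (position < size) {
        position += in.transferTo(position, size - position, out);
      }
      return position;
    }
  }

  /** Writes a 1 MiB temporary file and returns the number of bytes the sink received. */
  static long demo() throws IOException {
    final Path file = Files.createTempFile("transferTo", ".bin");
    try {
      Files.write(file, new byte[1 << 20]);
      final CountingChannel out = new CountingChannel();
      transferAll(file, out);
      return out.getBytes();
    } finally {
      Files.delete(file);
    }
  }
}
```

Any benchmark of idea 1 would need a target channel that the JDK recognizes as a file or socket channel; this sketch only shows the baseline behaviour.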

There are several ideas below to improve the performance.  We should benchmark 
them.
# Improve the current implementation of WritableByteChannel so that it may be 
able to use a faster transferTo method.
# Use 
[FileChannel.map(..)|https://docs.oracle.com/javase/8/docs/api/java/nio/channels/FileChannel.html#map-java.nio.channels.FileChannel.MapMode-long-long-]
 and pass MappedByteBuffer to our DataStreamOutput.writeAsync method.
# Add a new API
{code}
//DataStreamOutput
 CompletableFuture<DataStreamReply> writeAsync(File);
{code}
Internally, use Netty DefaultFileRegion for zero-copy file transfer:
https://github.com/netty/netty/blob/4.1/example/src/main/java/io/netty/example/file/FileServerHandler.java#L53
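
As a rough illustration of idea 2, the sketch below maps a file chunk by chunk and hands each MappedByteBuffer to an asynchronous write method. StreamOutput is a hypothetical stand-in for DataStreamOutput (its future carries a byte count instead of a DataStreamReply), and the chunk size and the stream and demo helpers are assumptions, not part of Ratis.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

final class MappedWriteSketch {
  /** Hypothetical stand-in for DataStreamOutput; the real reply type is DataStreamReply. */
  interface StreamOutput {
    CompletableFuture<Long> writeAsync(ByteBuffer buffer);
  }

  /** Maps the file in chunks of at most chunkSize bytes and writes each mapped chunk. */
  static long stream(Path file, int chunkSize, StreamOutput out) throws IOException {
    long total = 0;
    try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
      final long size = in.size();
      for (long offset = 0; offset < size; offset += chunkSize) {
        final long length = Math.min(chunkSize, size - offset);
        final MappedByteBuffer mapped = in.map(FileChannel.MapMode.READ_ONLY, offset, length);
        // Waiting on each write keeps the sketch simple; a real client would pipeline them.
        total += out.writeAsync(mapped).join();
      }
    }
    return total;
  }

  /** Streams a 10,000-byte temporary file into a sink that only counts bytes. */
  static long demo() throws IOException {
    final Path file = Files.createTempFile("mapped", ".bin");
    file.toFile().deleteOnExit(); // a mapped file cannot always be deleted immediately
    Files.write(file, new byte[10_000]);
    final StreamOutput sink = buffer -> {
      final long n = buffer.remaining();
      buffer.position(buffer.limit());
      return CompletableFuture.completedFuture(n);
    };
    return stream(file, 4096, sink);
  }
}
```

Mapping in bounded chunks avoids the 2 GB limit of a single mapping and keeps the amount of mapped memory predictable.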

The data flow of client -> primary -> peer is as follows.
 If we stream a file and do not calculate a checksum, we use transferTo. In the client, 
there are 1 DMA copy and 1 DMA gather copy, and no CPU copy. In the primary, there are
3 DMA copies and 3 CPU copies. In the peer, there are 2 DMA copies and 2 CPU copies. 
 !screenshot-6.png! 

If we stream a file and calculate a checksum, we use MappedByteBuffer. In the client, there 
are 2 DMA copies and 1 CPU copy.  In the primary, there are
3 DMA copies and 3 CPU copies. In the peer, there are 2 DMA copies and 2 CPU copies. 
 !screenshot-7.png! 
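
A minimal sketch of the checksum case: the checksum is computed directly over the mapped buffer, so the file content does not have to be read into a heap array first. CRC32 is an assumption here, since the description does not say which checksum is used, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;

final class MappedChecksumSketch {
  /** Computes a CRC32 over the whole file through a read-only mapping. */
  static long checksum(Path file) throws IOException {
    try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
      final MappedByteBuffer mapped = in.map(FileChannel.MapMode.READ_ONLY, 0, in.size());
      final CRC32 crc = new CRC32();
      // duplicate() leaves the position of 'mapped' untouched, so it could still be sent.
      crc.update(mapped.duplicate());
      return crc.getValue();
    }
  }

  /** Checks that the mapped checksum matches one computed over the same bytes on the heap. */
  static boolean demo() throws IOException {
    final byte[] data = "hello ratis streaming".getBytes(StandardCharsets.UTF_8);
    final Path file = Files.createTempFile("checksum", ".bin");
    file.toFile().deleteOnExit();
    Files.write(file, data);

    final CRC32 expected = new CRC32();
    expected.update(data);
    return checksum(file) == expected.getValue();
  }
}
```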

If we stream data that is not in a file and calculate a checksum, we use DirectByteBuffer. In 
the client, there are 2 DMA copies and 2 CPU copies. In the primary, there are
3 DMA copies and 3 CPU copies. In the peer, there are 2 DMA copies and 2 CPU copies. 

 !screenshot-8.png! 
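
For data that does not come from a file, a sketch along these lines produces the bytes straight into a direct buffer and checksums them there. As above, CRC32 and all names are assumptions for illustration only.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

final class DirectBufferSketch {
  /** Produces 'size' bytes of generated data in a direct (off-heap) buffer. */
  static ByteBuffer fill(int size) {
    final ByteBuffer buffer = ByteBuffer.allocateDirect(size);
    for (int i = 0; i < size; i++) {
      buffer.put((byte) i);
    }
    buffer.flip(); // switch from writing to reading
    return buffer;
  }

  /** Computes a CRC32 without disturbing the buffer's position. */
  static long checksum(ByteBuffer buffer) {
    final CRC32 crc = new CRC32();
    crc.update(buffer.duplicate());
    return crc.getValue();
  }

  /** Checks that the direct buffer yields the same checksum as the same bytes on the heap. */
  static boolean demo() {
    final int size = 1024;
    final ByteBuffer direct = fill(size);

    final byte[] heap = new byte[size];
    for (int i = 0; i < size; i++) {
      heap[i] = (byte) i;
    }
    final CRC32 expected = new CRC32();
    expected.update(heap);

    return direct.isDirect()
        && direct.remaining() == size
        && checksum(direct) == expected.getValue();
  }
}
```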

We should avoid reading data into the heap, e.g. into a HeapByteBuffer. In that case, in the client, there 
are 2 DMA copies and 4 CPU copies.   In the primary, there are
3 DMA copies and 3 CPU copies. In the peer, there are 2 DMA copies and 2 CPU copies. 
 !screenshot-9.png! 

The following is the flow before Ratis streaming, where ProtoBuf is used to send data. In 
the client, there are 2 DMA copies and 4 CPU copies. In the leader, there are 3 DMA copies and 
7 CPU copies. In the follower, there are 2 DMA copies and 5 CPU copies.
 !screenshot-5.png! 


  was:
In RATIS-1175, we provided a WritableByteChannel view of DataStreamOutput in 
order to support FileChannel.transferTo.  However, [~runzhiwang] pointed out 
that sun.nio.ch.FileChannelImpl.transferTo has three submethods
- transferToDirectly (fastest)
- transferToTrustedChannel
- transferToArbitraryChannel (slowest, requires buffer copying)

Unfortunately, our current implementation only able to use 
transferToArbitraryChannel.

There are several ideas below to improve the performance.  We should benchmark 
them.
# Improve the current implementation of WritableByteChannel so that it may be 
able to use a faster transferTo method.
# Use 
[FileChannel.map(..)|https://docs.oracle.com/javase/8/docs/api/java/nio/channels/FileChannel.html#map-java.nio.channels.FileChannel.MapMode-long-long-]
 and pass MappedByteBuffer to our DataStreamOutput.writeAsync method.
# Add a new API
{code}
//DataStreamOutput
 CompletableFuture<DataStreamReply> writeAsync(File);
{code}
Internally, use Netty DefaultFileRegion for zero-copy file transfer:
https://github.com/netty/netty/blob/4.1/example/src/main/java/io/netty/example/file/FileServerHandler.java#L53

The data flow of client -> primary -> peer as follows
 If stream file and do not calculate checksum, we use transferTo. In client, 
there are 1 DMA copy and 1 DMA gather copy, no CPU copy.
 !screenshot-6.png! 

If stream file and calculate checksum, we use MapByteBuffer. In client, there 
are 2 DMA copy and 1 CPU copy.
 !screenshot-7.png! 

If stream data not in file and calculate checksum, we use DirectByteBuffer. In 
client, there are 2 DMA copy and 2 CPU copy.

 !screenshot-8.png! 

we should avoid reading data into heap such as HeapByteBuffer. In client there 
are 2 DMA copy and 4 CPU copy.  
 !screenshot-9.png! 

The following is flow before ratis streaming and use ProtoBuf to send data. In 
client there are 2 DMA copy and 4 CPU copy. In leader, there are 3 DMA copy and 
7 CPU copy. In follower, there are 2 DMA copy and 5 CPU copy.
 !screenshot-5.png! 



> Benchmark various ways to stream data
> -------------------------------------
>
>                 Key: RATIS-1176
>                 URL: https://issues.apache.org/jira/browse/RATIS-1176
>             Project: Ratis
>          Issue Type: Sub-task
>          Components: client, Streaming
>            Reporter: Tsz-wo Sze
>            Priority: Major
>         Attachments: image-2020-11-25-07-40-50-383.png, screenshot-5.png, 
> screenshot-6.png, screenshot-7.png, screenshot-8.png, screenshot-9.png
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
