[
https://issues.apache.org/jira/browse/RATIS-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817516#comment-17817516
]
Duong commented on RATIS-2027:
------------------------------
Suggestions:
# Can we use a ReferenceCountedObject of ByteBuf#nioBuffers() as an input to
RemoteStream? This would eliminate all the copies.
# If we can't avoid duplicating the buffer, let's copy it to a pooled
DirectBuffer. This would avoid both the extra copy when writing to the network
and the GC cost.
@szetszwo
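
To make the second suggestion concrete, here is a minimal sketch of copying into a pooled direct buffer, using only the JDK. DirectBufferPool, acquire, release and copyOf are hypothetical names made up for illustration; they are not Ratis or Netty APIs, and a real change would more likely use Netty's pooled allocator together with reference counting.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

/** Hypothetical pool of fixed-size direct buffers; not a Ratis or Netty class. */
final class DirectBufferPool {
  private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
  private final int bufferSize;

  DirectBufferPool(int bufferSize) {
    this.bufferSize = bufferSize;
  }

  /** Reuse a released buffer when one is available, otherwise allocate off-heap. */
  synchronized ByteBuffer acquire() {
    final ByteBuffer b = free.poll();
    return b != null ? b : ByteBuffer.allocateDirect(bufferSize);
  }

  /** Return a buffer to the pool once the write that used it has completed. */
  synchronized void release(ByteBuffer b) {
    b.clear();
    free.push(b);
  }

  /** Copy the remaining bytes of src into a pooled direct buffer, ready for reading. */
  ByteBuffer copyOf(ByteBuffer src) {
    if (src.remaining() > bufferSize) {
      throw new IllegalArgumentException("source larger than pooled buffer size");
    }
    final ByteBuffer dst = acquire();
    dst.put(src.duplicate()); // duplicate() leaves the caller's position untouched
    dst.flip();
    return dst;
  }
}
```

The copy still happens once, but the destination is off-heap and reused, so no per-request heap buffer is left for GC, and the channel write should not need a further heap-to-direct copy.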
> Ratis Streaming: Remote Stream copy data to heap
> ------------------------------------------------
>
> Key: RATIS-2027
> URL: https://issues.apache.org/jira/browse/RATIS-2027
> Project: Ratis
> Issue Type: Improvement
> Components: Streaming
> Reporter: Duong
> Priority: Major
> Labels: performance
> Attachments: allocation.png, cpu.png, dn_write_streaming.htm,
> dn_write_streaming_allocation.htm
>
>
> In Ratis Streaming, the write to RemoteStream uses ByteBuf#nioBuffer(), which
> copies the ByteBuf content to a HeapByteBuffer.
> {code:java}
> CompletableFuture<DataStreamReply> write(DataStreamRequestByteBuf request,
>     Executor executor) {
>   final Timekeeper.Context context = metrics.start();
>   return composeAsync(sendFuture, executor,
>       n -> out.writeAsync(request.slice().nioBuffer(),
>               addFlush(request.getWriteOptionList()))
>           .whenComplete((l, e) -> metrics.stop(context, e == null)));
> }
> {code}
> This has two inefficiencies:
> # Each HeapBuffer is created and then discarded, which adds an O(N) cost to
> GC, where N is the amount of data flowing through the Ratis server.
> # When the HeapBuffer is written to the network, NIO has to copy it again
> to a DirectBuffer. So, in total, 2 copies.
> !allocation.png|width=722,height=251!
> !cpu.png|width=718,height=261!
> See: [^dn_write_streaming_allocation.htm] and [^dn_write_streaming.htm]
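
The two code paths discussed in the description can be sketched with the JDK alone. This is only an illustration under stated assumptions: GatherWriteSketch, mergeToHeap and writeToTempFile are hypothetical names, a FileChannel on a temporary file stands in for the network channel, and none of this is the Ratis or Netty implementation. mergeToHeap shows the extra copy into a short-lived heap buffer; writeToTempFile passes the direct chunks unchanged to a gathering write.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration; not part of Ratis or Netty. */
final class GatherWriteSketch {
  /** Merge the chunks into one heap buffer: one extra copy and one new allocation. */
  static ByteBuffer mergeToHeap(ByteBuffer[] chunks) {
    int total = 0;
    for (ByteBuffer c : chunks) {
      total += c.remaining();
    }
    final ByteBuffer merged = ByteBuffer.allocate(total);
    for (ByteBuffer c : chunks) {
      merged.put(c.duplicate()); // duplicate() keeps the chunk's own position intact
    }
    merged.flip();
    return merged;
  }

  /** Write the chunks as they are with a gathering write; no merged buffer is created. */
  static long writeToTempFile(ByteBuffer[] chunks) {
    long total = 0;
    for (ByteBuffer c : chunks) {
      total += c.remaining();
    }
    try {
      final Path file = Files.createTempFile("gather", ".bin");
      try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
        long written = 0;
        while (written < total) {
          written += channel.write(chunks); // GatheringByteChannel#write(ByteBuffer[])
        }
        return written;
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```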