rickyma commented on PR #1521:
URL:
https://github.com/apache/incubator-uniffle/pull/1521#issuecomment-1942153288
When stress testing the shuffle server without this PR, we easily run into
`OutOfDirectMemoryError`, which shows that this PR is necessary.
[epollEventLoopGroup-3-45] [WARN] TransportChannelHandler.exceptionCaught - Exception in connection from /127.0.0.1:58767
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 4194304 byte(s) of direct memory (used: 161061273600, max: 161061273600)
at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:843)
at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:772)
at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:710)
at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:685)
at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:212)
at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:194)
at io.netty.buffer.PoolArena.allocate(PoolArena.java:136)
at io.netty.buffer.PoolArena.allocate(PoolArena.java:126)
at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:397)
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
at org.apache.uniffle.common.netty.protocol.Decoders.decodeShuffleBlockInfo(Decoders.java:50)
at org.apache.uniffle.common.netty.protocol.SendShuffleDataRequest.decodePartitionData(SendShuffleDataRequest.java:95)
at org.apache.uniffle.common.netty.protocol.SendShuffleDataRequest.decode(SendShuffleDataRequest.java:107)
at org.apache.uniffle.common.netty.protocol.Message.decode(Message.java:145)
at org.apache.uniffle.common.netty.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:77)
Every time an out-of-direct-memory error occurs, the stack trace points to
`org.apache.uniffle.common.netty.protocol.Decoders.decodeShuffleBlockInfo(Decoders.java:50)`.
This allocation is the most direct trigger for running out of direct memory.
When a large number of requests arrive simultaneously, there can be a brief
period (before the `TransportFrameDecoder` has a chance to `release`) during
which the shuffle server holds twice as many `ByteBuf`s as usual.
For that short time, direct memory usage is doubled, and this is very hard to
control.
That is why it is so easy to hit an out-of-direct-memory error without
this PR.
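To make the doubling concrete, here is a rough accounting sketch in plain Java. It is not Uniffle or Netty code: the class `DirectMemoryPeakSketch`, the `Budget` counter, and both method names are hypothetical, and it assumes the doubling comes from the decoded copy being allocated while the source frame is still unreleased. It only compares the peak of a simulated counter under two release orders.

```java
public class DirectMemoryPeakSketch {

    /** Hypothetical counter: bytes currently held and the highest value seen. */
    static final class Budget {
        long used;
        long peak;

        void allocate(long bytes) {
            used += bytes;
            peak = Math.max(peak, used);
        }

        void release(long bytes) {
            used -= bytes;
        }
    }

    /**
     * All frames arrive, then every frame is copied into a new buffer,
     * and the source frames are released only after all copies exist.
     */
    static long peakWhenReleaseIsDeferred(int requests, long frameBytes) {
        Budget budget = new Budget();
        for (int i = 0; i < requests; i++) {
            budget.allocate(frameBytes); // incoming frame
        }
        for (int i = 0; i < requests; i++) {
            budget.allocate(frameBytes); // decoded copy, source still held
        }
        for (int i = 0; i < requests; i++) {
            budget.release(frameBytes);  // source frames released late
        }
        return budget.peak;
    }

    /**
     * All frames arrive, but each source frame is released right after
     * its copy is made, so at most one extra buffer is alive at a time.
     */
    static long peakWhenReleasedPerRequest(int requests, long frameBytes) {
        Budget budget = new Budget();
        for (int i = 0; i < requests; i++) {
            budget.allocate(frameBytes); // incoming frame
        }
        for (int i = 0; i < requests; i++) {
            budget.allocate(frameBytes); // decoded copy
            budget.release(frameBytes);  // source frame released immediately
        }
        return budget.peak;
    }

    public static void main(String[] args) {
        int requests = 1000;            // assumed number of concurrent requests
        long frameBytes = 4L * 1024 * 1024; // 4194304 bytes, the size in the log above
        System.out.println("deferred release peak:    "
            + peakWhenReleaseIsDeferred(requests, frameBytes));
        System.out.println("per-request release peak: "
            + peakWhenReleasedPerRequest(requests, frameBytes));
    }
}
```

In this model the deferred-release peak is exactly twice the payload size, while the per-request peak exceeds the payload by only one buffer. The real server is concurrent and pooled, so actual peaks will fall somewhere in between.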
So we need this PR in any case. It might slow down the flushing process a
little (in my tests there does not seem to be any impact on performance), but
the shuffle server at least remains available during the whole stress test.
Maybe we should prioritize availability first and consider deeper performance
optimization later?
WDYT? @jerqi