rickyma commented on PR #1521:
URL: 
https://github.com/apache/incubator-uniffle/pull/1521#issuecomment-1943178128

   > > I reopened this PR.
   > > When stress testing the shuffle server without this PR, we easily run 
into `OutOfDirectMemoryError`, which means this PR is necessary.
   > > [epollEventLoopGroup-3-45] [WARN] TransportChannelHandler.exceptionCaught - Exception in connection from /127.0.0.1:58767
   > > io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 4194304 byte(s) of direct memory (used: 161061273600, max: 161061273600)
   > >   at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:843)
   > >   at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:772)
   > >   at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:710)
   > >   at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:685)
   > >   at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:212)
   > >   at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:194)
   > >   at io.netty.buffer.PoolArena.allocate(PoolArena.java:136)
   > >   at io.netty.buffer.PoolArena.allocate(PoolArena.java:126)
   > >   at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:397)
   > >   at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
   > >   at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
   > >   at org.apache.uniffle.common.netty.protocol.Decoders.decodeShuffleBlockInfo(Decoders.java:50)
   > >   at org.apache.uniffle.common.netty.protocol.SendShuffleDataRequest.decodePartitionData(SendShuffleDataRequest.java:95)
   > >   at org.apache.uniffle.common.netty.protocol.SendShuffleDataRequest.decode(SendShuffleDataRequest.java:107)
   > >   at org.apache.uniffle.common.netty.protocol.Message.decode(Message.java:145)
   > >   at org.apache.uniffle.common.netty.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:77)
   > > We can see that each time an out-of-direct-memory error occurs, it is 
caused by the code 
`org.apache.uniffle.common.netty.protocol.Decoders.decodeShuffleBlockInfo(Decoders.java:50)`,
 which is `ByteBuf data = 
NettyUtils.getNettyBufferAllocator().directBuffer(dataLength)`. This is the 
most direct trigger for insufficient direct memory.
   > > When a large number of requests arrive simultaneously, there can be a 
brief period (before the `TransportFrameDecoder` has a chance to `release` its 
`ByteBuf`) during which the shuffle server holds two `ByteBuf`s for the same 
data: the received frame and the newly allocated copy. For that short time, 
direct memory usage is doubled and very hard to control. That is why it is very 
easy to hit an out-of-direct-memory error without this PR.
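
The doubling described above can be illustrated with a toy accounting model. This is only a sketch, not Netty or Uniffle code: `DecodePeakModel`, `Counter`, `Buf`, and `peakBytes` are hypothetical names, each payload is assumed to be as large as its frame, and every request is assumed to be decoded before any frame is released (the worst case). It compares allocating a fresh buffer per decoded block against retaining the received frame.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of direct-memory accounting (hypothetical; not Netty or Uniffle code). */
final class DecodePeakModel {

    /** Tracks the bytes currently reserved and the highest value seen so far. */
    static final class Counter {
        long used;
        long peak;

        void reserve(long n) {
            used += n;
            peak = Math.max(peak, used);
        }

        void free(long n) {
            used -= n;
        }
    }

    /** Reference-counted buffer that returns its bytes when the count reaches zero. */
    static final class Buf {
        final Counter counter;
        final long size;
        int refCnt = 1;

        Buf(Counter counter, long size) {
            this.counter = counter;
            this.size = size;
            counter.reserve(size);
        }

        Buf retain() {
            refCnt++;
            return this;
        }

        void release() {
            if (--refCnt == 0) {
                counter.free(size);
            }
        }
    }

    /** Returns the peak reserved bytes when decoding either copies or reuses each frame. */
    static long peakBytes(int requests, long frameSize, boolean copyOnDecode) {
        Counter counter = new Counter();

        // Frames as received from the network, still owned by the frame decoder.
        List<Buf> frames = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            frames.add(new Buf(counter, frameSize));
        }

        // Worst case: every request is decoded before any frame is released.
        List<Buf> decoded = new ArrayList<>();
        for (Buf frame : frames) {
            decoded.add(copyOnDecode ? new Buf(counter, frameSize) : frame.retain());
        }

        // The frame decoder drops its references, then the decoded blocks are flushed.
        for (Buf frame : frames) {
            frame.release();
        }
        for (Buf block : decoded) {
            block.release();
        }
        if (counter.used != 0) {
            throw new IllegalStateException("leaked bytes: " + counter.used);
        }
        return counter.peak;
    }

    public static void main(String[] args) {
        System.out.println("copy on decode: " + peakBytes(4, 100, true));   // 800
        System.out.println("reuse frame:    " + peakBytes(4, 100, false));  // 400
    }
}
```

In this model the copying path peaks at twice the bytes of the incoming frames, while the reuse path peaks at exactly the frame bytes, and both end with nothing reserved.
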
   > > So, we need this PR anyway. It might slow down the flushing process a 
little bit, but the shuffle server will at least remain available during the 
whole stress test.
   > > From the results of my stress tests, there doesn't seem to be any impact 
on performance. In fact, it may even be faster, since decoding no longer 
allocates new `ByteBuf`s. The slower flushing has not caused any anomalies or 
performance issues. Eventually, all these buffers are flushed and all 
`ByteBuf`s are released, with no memory leaks.
   > > PTAL @jerqi
   > 
   > Maybe we should modify our flush strategy, too. Currently we flush the 
larger reduce partitions. But if the map partition also contains a smaller 
reduce partition, that memory won't be released either.
   
   The flushing strategy will be changed in [the final 
PR](https://github.com/apache/incubator-uniffle/pull/1519/files#diff-652f62a6100de94d5e938d3acba76d1e2de041841dee0acf493d2219cb05e98dR301).
   
![image](https://github.com/apache/incubator-uniffle/assets/13834479/a9840775-191b-4a3a-b96c-cdb6808cfd32)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

