leixm commented on PR #2975:
URL: https://github.com/apache/celeborn/pull/2975#issuecomment-2514197789

   ```
   24/11/27 01:44:36,396 WARN [push-server-6-14] TransportChannelHandler: Exception in connection from /10.217.150.42:27112
   io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 4194304 byte(s) of direct memory (used: 10733223943, max: 10737418240)
           at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:843)
           at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:772)
           at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:717)
           at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:692)
           at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:215)
           at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:197)
           at io.netty.buffer.PoolArena.allocate(PoolArena.java:139)
           at io.netty.buffer.PoolArena.allocate(PoolArena.java:129)
           at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:395)
           at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
           at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
           at io.netty.buffer.CompositeByteBuf.allocBuffer(CompositeByteBuf.java:1879)
           at io.netty.buffer.CompositeByteBuf.consolidate0(CompositeByteBuf.java:1758)
           at io.netty.buffer.CompositeByteBuf.consolidateIfNeeded(CompositeByteBuf.java:571)
           at io.netty.buffer.CompositeByteBuf.addComponent(CompositeByteBuf.java:266)
           at io.netty.buffer.CompositeByteBuf.addComponent(CompositeByteBuf.java:222)
           at org.apache.celeborn.service.deploy.worker.storage.FileWriter.write(FileWriter.java:224)
           at org.apache.celeborn.service.deploy.worker.PushDataHandler.writeData$1(PushDataHandler.scala:1235)
           at org.apache.celeborn.service.deploy.worker.PushDataHandler.writeLocalData(PushDataHandler.scala:1278)
           at org.apache.celeborn.service.deploy.worker.PushDataHandler.handlePushMergedData(PushDataHandler.scala:636)
           at org.apache.celeborn.service.deploy.worker.PushDataHandler.$anonfun$receive$2(PushDataHandler.scala:146)
           at org.apache.celeborn.service.deploy.worker.PushDataHandler.handleCore(PushDataHandler.scala:743)
           at org.apache.celeborn.service.deploy.worker.PushDataHandler.receive(PushDataHandler.scala:147)
           at org.apache.celeborn.common.network.server.TransportRequestHandler.processOtherMessages(TransportRequestHandler.java:132)
           at org.apache.celeborn.common.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:88)
           at org.apache.celeborn.common.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:151)
           at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
           at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
           at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
           at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
           at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
           at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
           at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
           at org.apache.celeborn.common.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:74)
           at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
           at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
           at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
           at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
           at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
           at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
           at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
           at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
           at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
           at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
           at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
           at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
           at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
           at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
           at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
           at java.lang.Thread.run(Thread.java:748)
   ```
   
   
   ```
   24/11/27 03:05:19,880 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@93798665-/mnt/ssd/0 has error: Wait pending actions timeout.
   24/11/27 03:05:19,881 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@93798665-/mnt/ssd/0 has error: Wait pending actions timeout.
   24/11/27 03:05:19,881 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@93798665-/mnt/ssd/0 has error: Wait pending actions timeout.
   24/11/27 03:05:19,881 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@93798665-/mnt/ssd/0 has error: Wait pending actions timeout.
   24/11/27 03:05:19,881 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@93798665-/mnt/ssd/0 has error: Wait pending actions timeout.
   24/11/27 03:05:19,881 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@93798665-/mnt/ssd/0 has error: Wait pending actions timeout
   ```
   
   
![image](https://github.com/user-attachments/assets/3b5d8ec0-e1a4-4238-b08d-a418448893c0)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.