lwllvyb commented on PR #2106:
URL: 
https://github.com/apache/incubator-uniffle/pull/2106#issuecomment-2340020080

   I added some debug logging to verify whether duplicated blockIds occur.
   
![image](https://github.com/user-attachments/assets/c8475c2d-bf06-449e-a0fb-8ce7fa48af84)
   
   And I got the following logs:
   
   ```
   [2024-09-09 23:51:03.422] [epollEventLoopGroup-3-29] [WARN] AbstractShuffleBuffer - append partitionId=8920 blockId=37413336496 is duplicated, prev block=ShufflePartitionedBlock{blockId[37413336496], length[11896], size[11928], uncompressLength[21560], crc[2412607028], taskAttemptId[144816]}
   [2024-09-09 23:51:03.543] [Grpc-140] [ERROR] ShuffleServerGrpcService - Error happened when get shuffle result for appId[application_1703049085550_20299615_1725895467752], shuffleId[0], partitions[8920]
   org.apache.uniffle.common.exception.RssException: Inconsistent block number for partitions: [8920]. Excepted: 43995, actual: 43996
           at org.apache.uniffle.server.ShuffleTaskManager.getFinishedBlockIds(ShuffleTaskManager.java:656)
           at org.apache.uniffle.server.ShuffleServerGrpcService.getShuffleResultForMultiPart(ShuffleServerGrpcService.java:939)
           at org.apache.uniffle.proto.ShuffleServerGrpc$MethodHandlers.invoke(ShuffleServerGrpc.java:1056)
           at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
           at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
           at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
           at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
           at org.apache.uniffle.common.rpc.ClientContextServerInterceptor$1.onHalfClose(ClientContextServerInterceptor.java:63)
           at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
           at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
           at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:356)
           at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:861)
           at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
           at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:750)
   ```
   From these logs, the cause might be that a duplicate blockId is added to the 
same buffer in the bufferPool and replaces the previously added block with the 
same blockId, resulting in an untraceable memory leak.
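
   To illustrate the suspected mechanism, here is a minimal toy model. It is not the actual Uniffle buffer code: `Block`, `Buffer`, `append`, `flush`, and the `rejectDuplicates` flag are hypothetical names, and it assumes blocks are stored in a map keyed by blockId while memory is accounted separately per append. Only the blockId and size are taken from the log above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DuplicateBlockSketch {
    /** Hypothetical stand-in for a shuffle block; only the fields needed here. */
    static final class Block {
        final long blockId;
        final long size;

        Block(long blockId, long size) {
            this.blockId = blockId;
            this.size = size;
        }
    }

    /** Hypothetical buffer: blocks keyed by blockId, memory accounted per append. */
    static final class Buffer {
        private final Map<Long, Block> blocks = new LinkedHashMap<>();
        private final boolean rejectDuplicates;
        private long accountedBytes = 0;

        Buffer(boolean rejectDuplicates) {
            this.rejectDuplicates = rejectDuplicates;
        }

        /** Returns the number of bytes newly accounted for this append. */
        long append(Block block) {
            if (rejectDuplicates && blocks.containsKey(block.blockId)) {
                return 0; // duplicate skipped, nothing is accounted
            }
            blocks.put(block.blockId, block); // silently replaces an existing entry
            accountedBytes += block.size;     // but the accounting still grows
            return block.size;
        }

        /** Releases only the bytes still reachable through the map. */
        long flush() {
            long released = 0;
            for (Block b : blocks.values()) {
                released += b.size;
            }
            blocks.clear();
            accountedBytes -= released;
            return released;
        }

        long accountedBytes() {
            return accountedBytes;
        }
    }

    /** Bytes still accounted after appending the same block twice and flushing. */
    static long leakedBytes(boolean rejectDuplicates) {
        Buffer buffer = new Buffer(rejectDuplicates);
        Block block = new Block(37413336496L, 11928L); // values from the WARN log line
        buffer.append(block);
        buffer.append(block);
        buffer.flush();
        return buffer.accountedBytes();
    }

    public static void main(String[] args) {
        System.out.println("without duplicate check: " + leakedBytes(false));
        System.out.println("with duplicate check: " + leakedBytes(true));
    }
}
```

   In this model the unguarded buffer still accounts 11928 bytes after the flush, because the replaced block is no longer reachable and its size is never released. With the duplicate check the accounted bytes return to 0. Whether the real buffer behaves this way still needs to be confirmed against the code.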
   
   @zuston 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

