rickyma commented on PR #1679:
URL: https://github.com/apache/incubator-uniffle/pull/1679#issuecomment-2099998948

   > Yes. But I think this will happen when AppPurgeEvent is triggered and ShufflePurgeEvent is not triggered at the same time, maybe this is rare.
   
   A ShufflePurgeEvent is only created when an unregister shuffle request is received, so I think this change will help a lot in my case. The situation is very common in my environment: users often kill Spark jobs, which interrupts the client threads before they can finish sending unregister shuffle requests to the shuffle servers. The stack trace looks something like this:
   ```
   [21:57:36:865] [unregister-shuffle-0] WARN  org.apache.uniffle.client.impl.ShuffleWriteClientImpl.lambda$unregisterShuffle$28:1021 - Error happened when unregistering to ShuffleServerInfo{host[123.123.123.123], grpc port[12345], netty port[17000]}
   io.grpc.StatusRuntimeException: CANCELLED: Thread interrupted
           at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at org.apache.uniffle.proto.ShuffleServerGrpc$ShuffleServerBlockingStub.unregisterShuffleByAppId(ShuffleServerGrpc.java:772) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at org.apache.uniffle.client.impl.grpc.ShuffleServerGrpcClient.doUnregisterShuffleByAppId(ShuffleServerGrpcClient.java:337) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at org.apache.uniffle.client.impl.grpc.ShuffleServerGrpcClient.unregisterShuffleByAppId(ShuffleServerGrpcClient.java:344) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at org.apache.uniffle.client.impl.ShuffleWriteClientImpl.lambda$unregisterShuffle$28(ShuffleWriteClientImpl.java:1016) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at org.apache.uniffle.common.util.ThreadUtils.lambda$null$0(ThreadUtils.java:110) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_352]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_352]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_352]
           at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_352]
   Caused by: java.lang.InterruptedException
           at io.grpc.stub.ClientCalls$ThreadlessExecutor.throwIfInterrupted(ClientCalls.java:750) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at io.grpc.stub.ClientCalls$ThreadlessExecutor.waitAndDrain(ClientCalls.java:711) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:159) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
           ... 9 more
   ```
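
To make the failure mode concrete, here is a minimal sketch of what I believe is happening. It uses no Uniffle or gRPC code: the class `InterruptedUnregisterSketch`, the method `unregisterAll`, and the `CountDownLatch` standing in for the blocking unary RPC are all hypothetical, and `shutdownNow()` is only an assumed stand-in for a user killing the Spark job. It shows that once the worker thread is interrupted, the blocking call fails and no unregister request is acknowledged.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not Uniffle code: models a client thread that sends
// blocking "unregister" calls and is interrupted when the job is killed.
class InterruptedUnregisterSketch {

  // Returns how many servers acknowledged the (simulated) unregister request.
  static int unregisterAll(List<String> servers, boolean killJob) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    AtomicInteger acknowledged = new AtomicInteger();
    CountDownLatch workerStarted = new CountDownLatch(1);
    // Assumption: in the killed case the reply never arrives before the interrupt;
    // in the normal case the latch is already open, so await() returns at once.
    CountDownLatch serverReply = new CountDownLatch(killJob ? 1 : 0);

    pool.execute(() -> {
      workerStarted.countDown();
      for (String server : servers) {
        try {
          serverReply.await(); // stands in for a blocking unary RPC
          acknowledged.incrementAndGet();
        } catch (InterruptedException e) {
          System.out.println(
              "WARN Error happened when unregistering to " + server + ": thread interrupted");
          Thread.currentThread().interrupt(); // preserve the interrupt status
          return; // the remaining servers never receive a request
        }
      }
    });

    try {
      workerStarted.await();
      if (killJob) {
        pool.shutdownNow(); // interrupts the worker thread
      } else {
        pool.shutdown(); // lets the running task finish
      }
      pool.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("sketch was interrupted while waiting", e);
    }
    return acknowledged.get();
  }

  public static void main(String[] args) {
    List<String> servers = Arrays.asList("server-1", "server-2");
    System.out.println("acknowledged (normal stop): " + unregisterAll(servers, false));
    System.out.println("acknowledged (job killed):  " + unregisterAll(servers, true));
  }
}
```

In the killed case nothing is acknowledged, which matches what I see: the shuffle server never receives the unregister shuffle request, so no ShufflePurgeEvent is created for that shuffle.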


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

