rickyma commented on PR #1679:
URL: https://github.com/apache/incubator-uniffle/pull/1679#issuecomment-2099998948
> Yes. But I think this will happen when AppPurgeEvent is triggered and
> ShufflePurgeEvent is not triggered at the same time, maybe this is rare.
> ShufflePurgeEvent will only be created when an unregister shuffle request is
> received.

So I think it will help a lot in my case, because this is very common in my
environment. Spark jobs may be killed by users, which interrupts the threads
and prevents them from sending the remaining unregister shuffle requests to the
shuffle servers. The stack trace looks something like this:
```
[21:57:36:865] [unregister-shuffle-0] WARN org.apache.uniffle.client.impl.ShuffleWriteClientImpl.lambda$unregisterShuffle$28:1021 - Error happened when unregistering to ShuffleServerInfo{host[123.123.123.123], grpc port[12345], netty port[17000]}
io.grpc.StatusRuntimeException: CANCELLED: Thread interrupted
    at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at org.apache.uniffle.proto.ShuffleServerGrpc$ShuffleServerBlockingStub.unregisterShuffleByAppId(ShuffleServerGrpc.java:772) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at org.apache.uniffle.client.impl.grpc.ShuffleServerGrpcClient.doUnregisterShuffleByAppId(ShuffleServerGrpcClient.java:337) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at org.apache.uniffle.client.impl.grpc.ShuffleServerGrpcClient.unregisterShuffleByAppId(ShuffleServerGrpcClient.java:344) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at org.apache.uniffle.client.impl.ShuffleWriteClientImpl.lambda$unregisterShuffle$28(ShuffleWriteClientImpl.java:1016) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at org.apache.uniffle.common.util.ThreadUtils.lambda$null$0(ThreadUtils.java:110) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_352]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_352]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_352]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_352]
Caused by: java.lang.InterruptedException
    at io.grpc.stub.ClientCalls$ThreadlessExecutor.throwIfInterrupted(ClientCalls.java:750) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at io.grpc.stub.ClientCalls$ThreadlessExecutor.waitAndDrain(ClientCalls.java:711) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:159) ~[rss-client-spark3-shaded-0.9.0-SNAPSHOT.jar:?]
    ... 9 more
```
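
For anyone who wants to see the failure mode in isolation, here is a minimal JDK-only sketch. It is not Uniffle code: `InterruptedUnregisterDemo`, `blockingUnregister` and `runAndKill` are hypothetical names, and a `CountDownLatch` that is never released stands in for the blocking gRPC call. `shutdownNow()` plays the role of the job being killed. The point is only that a worker blocked on a synchronous call gets an `InterruptedException` and never finishes the request.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class InterruptedUnregisterDemo {

    // Hypothetical stand-in for a blocking unregister RPC: it signals that it
    // has started, then waits for a "response" that never arrives.
    static void blockingUnregister(CountDownLatch started) throws InterruptedException {
        started.countDown();
        new CountDownLatch(1).await(); // blocks until the thread is interrupted
    }

    // Returns true if the unregister task was cut short by an interrupt.
    static boolean runAndKill() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        pool.submit(() -> {
            try {
                blockingUnregister(started);
            } catch (InterruptedException e) {
                // The request never completed, so the server was not told to clean up.
                interrupted.set(true);
                Thread.currentThread().interrupt(); // restore the interrupt flag
            }
        });

        started.await();      // make sure the task is already running
        pool.shutdownNow();   // simulates the job being killed: interrupts the worker
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return interrupted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("unregister interrupted: " + runAndKill());
    }
}
```

In this sketch the unregister request is simply lost once the worker is interrupted, which is why a server-side cleanup path that does not depend on the client request helps in this situation.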