Rembrant777 commented on issue #402:
URL: 
https://github.com/apache/incubator-uniffle/issues/402#issuecomment-1351719092

   I suspect the main cause of this test failure is that the value of 
`rss.server.app.expired.withoutHeartbeat` is too small. 
   
   It can be reproduced by setting a breakpoint on the client side so that the 
client's response is blocked immediately. Once the server-side 
`ShuffleTaskManager`#`expiredAppCleanupExecutorService` detects that the 
timeout has elapsed, the app is treated as expired and the `removeResources` 
method is invoked to remove the application (appId[case1]). 
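
To make the timing concrete, here is a minimal sketch of a heartbeat-based
expiry check. It is not Uniffle's actual code: `AppExpiryTracker`, its methods,
and the `expiredWithoutHeartbeatMs` field are hypothetical names, with the
field standing in for `rss.server.app.expired.withoutHeartbeat`. Timestamps are
passed in explicitly so the behaviour does not depend on the wall clock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a server-side expiry check.
class AppExpiryTracker {
    // appId -> timestamp (ms) of the last heartbeat seen from that app
    private final Map<String, Long> lastHeartbeatMs = new ConcurrentHashMap<>();
    // Assumed to play the role of rss.server.app.expired.withoutHeartbeat
    private final long expiredWithoutHeartbeatMs;

    AppExpiryTracker(long expiredWithoutHeartbeatMs) {
        this.expiredWithoutHeartbeatMs = expiredWithoutHeartbeatMs;
    }

    // Record a heartbeat from the client at time nowMs.
    void heartbeat(String appId, long nowMs) {
        lastHeartbeatMs.put(appId, nowMs);
    }

    // Remove and return every app whose last heartbeat is older than the timeout.
    List<String> removeExpired(long nowMs) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastHeartbeatMs.entrySet()) {
            if (nowMs - entry.getValue() > expiredWithoutHeartbeatMs) {
                expired.add(entry.getKey());
            }
        }
        for (String appId : expired) {
            lastHeartbeatMs.remove(appId);
        }
        return expired;
    }

    boolean isRegistered(String appId) {
        return lastHeartbeatMs.containsKey(appId);
    }
}
```

With a small timeout, a client paused at a breakpoint (or simply slow) stops
sending heartbeats for longer than the threshold, so the next cleanup pass
removes the app while the client still expects to read from it.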
   
   When the client then tries to fetch partition/block data from the expired 
application, it receives only an error message with the required data fields 
left blank, and it eventually throws an exception. 
   
   Here are the error logs from the server side: 
   ```
   2022-12-14 23:56:45,238 INFO  [clearResourceThread] server.ShuffleTaskManager (ShuffleTaskManager.java:removeResources(477)) - Start remove resource for appId[case1]
   2022-12-14 23:56:45,238 ERROR [clearResourceThread] server.ShuffleTaskManager (ShuffleTaskManager.java:lambda$new$3(136)) - Exception happened when clear resource for expired application
   java.lang.NullPointerException
        at org.apache.uniffle.server.ShuffleTaskManager.removeResources(ShuffleTaskManager.java:479)
        at org.apache.uniffle.server.ShuffleTaskManager.lambda$new$3(ShuffleTaskManager.java:130)
        at java.lang.Thread.run(Thread.java:748)
   ```
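
The trace does not show which reference was null at line 479. One common way
for a cleanup path to hit an NPE is to look up per-app state that has already
been removed (or was never registered) and then dereference it; that is an
assumption here, not something the log confirms. Under that assumption, a
defensive sketch could look like the following. `ResourceCleaner`, `register`,
and the `appResources` map are hypothetical and do not mirror Uniffle's
classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a null-safe cleanup; names are illustrative only.
class ResourceCleaner {
    // appId -> some per-app state (a StringBuilder is used as a placeholder)
    private final Map<String, StringBuilder> appResources = new ConcurrentHashMap<>();

    void register(String appId) {
        appResources.put(appId, new StringBuilder("state-of-" + appId));
    }

    // Returns true if state was found and cleaned, false if there was nothing to do.
    boolean removeResources(String appId) {
        // remove() is atomic on ConcurrentHashMap, so at most one caller gets the state
        StringBuilder state = appResources.remove(appId);
        if (state == null) {
            // Already cleaned up or never registered: skip instead of dereferencing null
            return false;
        }
        state.setLength(0); // placeholder for releasing buffers, files, etc.
        return true;
    }
}
```

Calling `removeResources` twice for the same appId, or for an unknown appId,
then returns `false` rather than throwing.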
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

