leixm commented on issue #198:
URL: https://github.com/apache/incubator-uniffle/issues/198#issuecomment-1246378460

   > > We can increase the rpc timeout, and find the reason for the slow response of the shuffle server.
   > 
   > Increasing the rpc timeout is not a fundamental fix. I also found that the rpc was accepted by the shuffle server (the handled log was shown), so maybe it is stuck on sending.
   
   We hit a similar rpc timeout exception before; it showed up in tasks 
processing 10 TB of data. The investigation found that inflush_memory and 
used_memory on the shuffle server were too high, which caused the client to 
retry frequently when sending data and requesting buffers.
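
   As a rough illustration of the effect described above, here is a toy model 
in Python. It is only a sketch under stated assumptions, not Uniffle's actual 
code: the class `ToyShuffleServer`, the functions `require_buffer`, `tick` and 
`send_with_retry`, and all numbers are hypothetical. It assumes a buffer 
request is granted only when used_memory plus inflush_memory plus the 
requested size fits within a fixed capacity, and that each client retry gives 
the server time to finish flushing a fixed number of bytes.

```python
from dataclasses import dataclass


@dataclass
class ToyShuffleServer:
    """Hypothetical model of server-side memory accounting (not Uniffle code)."""
    capacity: int
    used_memory: int = 0      # bytes held in buffers, not yet being flushed
    inflush_memory: int = 0   # bytes currently being flushed to storage

    def require_buffer(self, size: int) -> bool:
        # Assumption: grant the request only if it fits under the capacity.
        if self.used_memory + self.inflush_memory + size > self.capacity:
            return False
        self.used_memory += size
        return True

    def tick(self, flush_rate: int) -> None:
        # One retry interval passes: up to flush_rate bytes finish flushing.
        self.inflush_memory -= min(self.inflush_memory, flush_rate)


def send_with_retry(server, size, max_retries, flush_rate):
    """Return how many retries were needed, or None if every attempt failed."""
    for attempt in range(max_retries + 1):
        if server.require_buffer(size):
            return attempt
        server.tick(flush_rate)  # stand-in for the client's wait before retrying
    return None


if __name__ == "__main__":
    idle = ToyShuffleServer(capacity=100)
    busy = ToyShuffleServer(capacity=100, used_memory=40, inflush_memory=60)
    print("idle server retries:", send_with_retry(idle, 10, 3, 5))
    print("busy server retries:", send_with_retry(busy, 10, 3, 5))
```

   In this model an idle server grants the buffer on the first attempt, while 
a server whose memory is already full forces the client through several 
retries. If the retries run out, or their total wait exceeds the rpc timeout, 
the client sees a failure even though the server did accept the request.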


