leixm commented on issue #198: URL: https://github.com/apache/incubator-uniffle/issues/198#issuecomment-1246378460
> > We can increase the rpc timeout, and find the reason for the slow response of the shuffle server.
>
> Increasing the rpc timeout is not a fundamental fix. Also, I found that the rpc had been accepted by the shuffle server (the handled log was shown), so maybe it is stuck on sending.

We had a similar rpc timeout exception before; it appeared in a task running on 10T of data. The investigation found that inflush_memory and used_memory were too high, which caused the client to frequently retry sending data and applying for buffer.