advancedxy commented on code in PR #428:
URL: https://github.com/apache/incubator-uniffle/pull/428#discussion_r1050450133
##########
server/src/main/java/org/apache/uniffle/server/ShuffleServerGrpcService.java:
##########
@@ -226,26 +227,21 @@ public void sendShuffleData(SendShuffleDataRequest req,
return;
}
final long start = System.currentTimeMillis();
+ manager.markBufferInUse(requireBufferId);
Review Comment:
> In current codebase, it is extremely rare too.
I don't think so. I noticed this issue on one of our prod servers, and it
has probably already happened on other servers.
@zuston do you have any other data on this issue?
On the master branch, preAllocatedSize is decreased multiple times during a
single `sendShuffleData` request. If anything goes wrong in one of that
request's `manager.cacheShuffleData` calls, preAllocatedSize ends up wrong.
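To illustrate the drift described above, here is a minimal toy model, not the actual Uniffle code. The class `PreAllocAccounting` and its methods `reserve`, `sendPerBlock`, and `sendReleaseOnce` are hypothetical names invented for this sketch, and the "release once in `finally`" variant is only one possible alternative, not necessarily what this PR does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the server's buffer accounting (not Uniffle's real classes).
class PreAllocAccounting {
    // Bytes reserved ahead of a send request and not yet accounted as cached or released.
    final AtomicLong preAllocatedSize = new AtomicLong();

    void reserve(long bytes) {
        preAllocatedSize.addAndGet(bytes);
    }

    // Stand-in for caching one block; throws when index == failAt to simulate an error.
    private void cacheBlock(int index, int failAt) {
        if (index == failAt) {
            throw new IllegalStateException("cache failed at block " + index);
        }
    }

    // One decrement per block: a failure part-way through leaves the
    // remaining reserved bytes counted in preAllocatedSize.
    void sendPerBlock(List<Long> blockSizes, int failAt) {
        for (int i = 0; i < blockSizes.size(); i++) {
            cacheBlock(i, failAt);
            preAllocatedSize.addAndGet(-blockSizes.get(i));
        }
    }

    // Assumed alternative: release the whole reservation exactly once,
    // whether or not caching succeeded.
    void sendReleaseOnce(long reserved, List<Long> blockSizes, int failAt) {
        try {
            for (int i = 0; i < blockSizes.size(); i++) {
                cacheBlock(i, failAt);
            }
        } finally {
            preAllocatedSize.addAndGet(-reserved);
        }
    }

    public static void main(String[] args) {
        List<Long> blocks = Arrays.asList(100L, 50L, 50L);

        PreAllocAccounting perBlock = new PreAllocAccounting();
        perBlock.reserve(200);
        try {
            perBlock.sendPerBlock(blocks, 1); // second block fails
        } catch (IllegalStateException e) {
            // the request fails, but the counter is not repaired
        }
        System.out.println("per-block leftover: " + perBlock.preAllocatedSize.get());

        PreAllocAccounting once = new PreAllocAccounting();
        once.reserve(200);
        try {
            once.sendReleaseOnce(200, blocks, 1);
        } catch (IllegalStateException e) {
            // the request fails, the reservation was still released in finally
        }
        System.out.println("release-once leftover: " + once.preAllocatedSize.get());
    }
}
```

In this toy run the per-block variant keeps 100 bytes counted after the failure, while the release-once variant returns to 0. Repeated across many failed requests, that leftover is the kind of drift I mean.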
> It will lead to OOM
In theory, it would. But considering that this happens extremely rarely, and
that we start flushing once the high watermark is reached, it should be
fine?
> The gc time of our server will be a little long. Will it increase the
probability of this problem?
Quite possible.