Ngone51 commented on code in PR #36088:
URL: https://github.com/apache/spark/pull/36088#discussion_r847255904
##########
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java:
##########
@@ -118,11 +119,18 @@ public ShuffleIndexInformation load(String filePath)
throws IOException {
return new ShuffleIndexInformation(filePath);
}
};
- shuffleIndexCache = CacheBuilder.newBuilder()
- .maximumWeight(JavaUtils.byteStringAsBytes(indexCacheSize))
- .weigher((Weigher<String, ShuffleIndexInformation>)
- (filePath, indexInfo) -> indexInfo.getRetainedMemorySize())
- .build(indexCacheLoader);
+ CacheBuilder cacheBuilder = CacheBuilder.newBuilder()
+ .maximumWeight(JavaUtils.byteStringAsBytes(indexCacheSize))
+ .weigher((Weigher<String, ShuffleIndexInformation>)
+ (filePath, indexInfo) -> indexInfo.getRetainedMemorySize());
+ int expireTimeSeconds = conf.shuffleIndexCacheExpireTimeSeconds();
+ if (expireTimeSeconds > 0) {
+ shuffleIndexCache = cacheBuilder.expireAfterAccess(expireTimeSeconds,
TimeUnit.SECONDS)
Review Comment:
A timeout threshold doesn't seem like a user-friendly config. I don't think
users can come up with a reasonable timeout value for cleaning up the cache.
In the worst case, an arbitrary value (e.g., a very small one) could cause a
regression, since entries still in use would expire and have to be reloaded.
The only point where I think we can safely remove the shuffle cache today is
when `applicationRemoved` is called. Cleaning up there could mitigate the
issue you mentioned. However, I don't see an API in the Guava cache for
removing unused cache entries.
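
To make the `applicationRemoved` idea concrete, here is a minimal sketch, not
the PR's code. It uses a plain `ConcurrentMap` as a stand-in for the index
cache. Guava's `Cache.asMap()` returns a `ConcurrentMap` view whose removals
write through to the cache, so a similar loop might work there. The class
name, method name, and paths are hypothetical, and it assumes an application's
index files can be identified by the local directories they live under.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper; not part of Spark or Guava.
final class IndexCacheCleanup {

  /**
   * Removes every entry whose key (an index file path) lies under one of the
   * given directories. Returns the number of entries removed.
   */
  static <V> int removeEntriesUnder(ConcurrentMap<String, V> cache, List<String> appLocalDirs) {
    List<Path> dirs = new ArrayList<>();
    for (String dir : appLocalDirs) {
      dirs.add(Paths.get(dir));
    }
    int removed = 0;
    for (Iterator<String> it = cache.keySet().iterator(); it.hasNext();) {
      Path file = Paths.get(it.next());
      for (Path dir : dirs) {
        // Path.startsWith compares whole path components, so "/data/app-1"
        // does not match a file under "/data/app-10".
        if (file.startsWith(dir)) {
          it.remove();
          removed++;
          break;
        }
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    // Made-up paths and sizes, for illustration only.
    ConcurrentMap<String, Long> cache = new ConcurrentHashMap<>();
    cache.put("/data/app-1/blockmgr-a/shuffle_0_0_0.index", 128L);
    cache.put("/data/app-1/blockmgr-a/shuffle_0_1_0.index", 256L);
    cache.put("/data/app-10/blockmgr-b/shuffle_0_0_0.index", 64L);

    int removed = removeEntriesUnder(cache, Arrays.asList("/data/app-1"));
    System.out.println(removed + " removed, " + cache.size() + " left");
  }
}
```

Matching on path components rather than a raw string prefix avoids evicting
entries that belong to a different application with a similar directory name.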
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.