attilapiros commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r812188973
##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
return new ShuffleIndexInformation(file);
}
};
+
+ // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+ // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+ // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+ // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+ // easily cause an OOM in the NodeManager even when the default cache size limit is used.
+ final int weightCorrection = 1176;
shuffleIndexCache = CacheBuilder.newBuilder()
.maximumWeight(JavaUtils.byteStringAsBytes(indexCacheSize))
- .weigher((Weigher<File, ShuffleIndexInformation>) (file, indexInfo) -> indexInfo.getSize())
+ .weigher((Weigher<File, ShuffleIndexInformation>)
+ (file, indexInfo) -> indexInfo.getSize() + weightCorrection)
Review comment:
It does, but expect a much better solution soon, one where the path rather than
the File is used as the key.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]