attilapiros commented on pull request #35559: URL: https://github.com/apache/spark/pull/35559#issuecomment-1044112522
> Thank you very much for tackling this. It's been a while since I looked at it. I'm unsure why your number is 10 times larger than mine though.

The reason must be that I focused on the leak suspect `LocalCache$Segment`, which stores both the key (`java.io.File`, ~960 bytes because it stores the file path) and the value (`ShuffleIndexInformation`, ~160 bytes in the picture):

<img width="851" alt="image" src="https://user-images.githubusercontent.com/2017933/154637434-d1296105-bd56-4ed7-a1ae-83f2059eac35.png">

Both solutions would work. In my case we get a stricter limit on the full cache. But look at the `Weigher` interface: https://guava.dev/releases/18.0/api/docs/com/google/common/cache/Weigher.html

It receives the key too, and the description refers to the cache entry, not only the value:

> Returns the weight of a cache entry. There is no unit for entry weights; rather they are simply relative to each other.
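To make the difference between the two approaches concrete, here is a minimal, self-contained sketch. It does not depend on Guava: `EntryWeigher` is a local stand-in with the same `weigh(key, value)` shape as Guava's `Weigher`, `IndexInfo` is a hypothetical placeholder for `ShuffleIndexInformation`, and the two-bytes-per-path-character key estimate is only an assumption for illustration. The 960 and 160 byte figures are the approximate sizes quoted above, and the 100 MiB budget is an arbitrary example value.

```java
import java.io.File;

public class WeigherSketch {
    /** Local stand-in with the same shape as Guava's Weigher#weigh(K, V). */
    public interface EntryWeigher<K, V> {
        int weigh(K key, V value);
    }

    /** Hypothetical placeholder for ShuffleIndexInformation; only its size matters here. */
    public static final class IndexInfo {
        public final int sizeInBytes;

        public IndexInfo(int sizeInBytes) {
            this.sizeInBytes = sizeInBytes;
        }
    }

    /** Counts only the value, so the memory held by the File key is ignored. */
    public static final EntryWeigher<File, IndexInfo> VALUE_ONLY =
        (file, info) -> info.sizeInBytes;

    /** Counts the key as well, which is possible because weigh() receives both. */
    public static final EntryWeigher<File, IndexInfo> KEY_AND_VALUE =
        (file, info) -> estimateKeyBytes(file) + info.sizeInBytes;

    /** Crude assumption: two bytes per character of the stored path. */
    public static int estimateKeyBytes(File file) {
        return 2 * file.getPath().length();
    }

    /** Number of entries admitted before the weight budget is used up. */
    public static long entriesThatFit(long maxWeight, int weightPerEntry) {
        return maxWeight / weightPerEntry;
    }

    public static void main(String[] args) {
        long budget = 100L * 1024 * 1024; // example budget, not a Spark default
        int keyBytes = 960;               // approximate key size from the heap dump
        int valueBytes = 160;             // approximate value size from the heap dump

        long valueOnly = entriesThatFit(budget, valueBytes);
        long keyAndValue = entriesThatFit(budget, keyBytes + valueBytes);

        System.out.println("entries admitted, value-only weight: " + valueOnly);
        System.out.println("entries admitted, key+value weight:  " + keyAndValue);
        // Heap actually retained when the value-only cache is "full":
        System.out.println("bytes retained by value-only cache:  "
            + valueOnly * (keyBytes + valueBytes));
    }
}
```

With these example figures, the value-only weight admits about seven times as many entries as the key-plus-value weight for the same nominal budget, which is why counting the key gives the stricter limit.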
