dimas-b commented on code in PR #490:
URL: https://github.com/apache/polaris/pull/490#discussion_r1964849526
##########
polaris-core/src/main/java/org/apache/polaris/core/persistence/cache/EntityCache.java:
##########
@@ -71,12 +71,13 @@ public EntityCache(@NotNull PolarisRemoteCache polarisRemoteCache) {
};
// use a Caffeine cache to purge entries when those have not been used for a long time.
- // Assuming 1KB per entry, 100K entries is about 100MB.
this.byId =
Caffeine.newBuilder()
- .maximumSize(100_000) // Set maximum size to 100,000 elements
+ .maximumWeight(100 * EntityWeigher.WEIGHT_PER_MB) // Goal is ~100MB
+ .weigher(EntityWeigher.asWeigher())
.expireAfterAccess(1, TimeUnit.HOURS) // Expire entries after 1 hour of no access
.removalListener(removalListener) // Set the removal listener
+ .softValues() // Account for memory pressure
Review Comment:
GC behaviour with soft references is very GC-implementation-specific, AFAIK. A
large number of soft-referenced objects can still cause huge GC overhead
(in particular with Parallel GC, in my experience). This can easily happen if
the weigher underestimates entry sizes.
On the other hand, if the weigher overestimates, that reduces cache
efficiency and wastes memory (allotted for the cache, but not actually used by
it).
Yet, if the weigher is accurate and the cache stays within its allotted
bounds, why would we want to use soft references?
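To put rough numbers on the trade-off, here is a minimal sketch. The class and method names are hypothetical (this is not Polaris or Caffeine code), and it assumes the weigher's error is a constant ratio across all entries, which a real weigher would not guarantee.

```java
// Hypothetical illustration only: how a constant weigher error changes the
// heap actually retained by a cache that is "full" according to its weight limit.
final class WeigherErrorSketch {

    /**
     * @param budgetBytes         configured maximum weight, interpreted as bytes
     * @param estimatedOverActual ratio of the weigher's estimate to the true entry size
     *                            (below 1.0 means underestimating, above 1.0 overestimating)
     * @return approximate bytes really held once the weight limit is reached
     */
    static long actualBytesWhenFull(long budgetBytes, double estimatedOverActual) {
        if (estimatedOverActual <= 0) {
            throw new IllegalArgumentException("ratio must be positive");
        }
        return Math.round(budgetBytes / estimatedOverActual);
    }

    public static void main(String[] args) {
        long budget = 100L * 1024 * 1024; // the ~100MB goal from the diff

        // Weigher reports half the true size: the cache holds about twice the budget.
        long under = actualBytesWhenFull(budget, 0.5);
        // Weigher reports double the true size: half of the budget goes unused.
        long over = actualBytesWhenFull(budget, 2.0);

        System.out.println("underestimating: " + under / (1024 * 1024) + " MB retained");
        System.out.println("overestimating:  " + over / (1024 * 1024) + " MB retained");
    }
}
```

In the underestimating case, everything beyond the budget would be soft-referenced heap that the GC has to deal with, which is the scenario I am worried about.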