bruno-roustant commented on code in PR #1155:
URL: https://github.com/apache/solr/pull/1155#discussion_r1043219211


##########
solr/core/src/java/org/apache/solr/core/TransientSolrCoreCacheDefault.java:
##########
@@ -70,10 +80,15 @@ public TransientSolrCoreCacheDefault(CoreContainer 
coreContainer) {
     Caffeine<String, SolrCore> transientCoresCacheBuilder =
         Caffeine.newBuilder()
             .initialCapacity(initialCapacity)
-            // Use the current thread to queue evicted cores for closing. This 
ensures the
-            // cache max size is respected (with a different thread the max 
size would be
-            // respected asynchronously only eventually).
-            .executor(Runnable::run)
+            // Do NOT use the current thread to queue evicted cores for 
closing. Although this
+            // ensures the cache max size is respected only eventually, it 
should be a very
+            // short period of time until the cache maintenance task kicks in.
+            // The onEvict method needs the writeLock from SolrCores to 
operate safely.
+            // However, the current thread most probably has acquired a 
read-lock already
+            // somewhere up the call stack and would deadlock.
+            // Note that Caffeine cache has an internal maintenance task 
rescheduling logic which
+            // explicitly checks on the common pool.
+            .executor(ForkJoinPool.commonPool())

Review Comment:
   @ben-manes we had a [discussion](https://github.com/apache/solr/pull/580) on 
this subject some time ago. You told us about a way to pin entries. But in our 
case here, the is-in-use status of an entry is transient, we cannot change the 
weight of many entries each time they are used.
   So we decided to un-evict (put back) the entry to the cache if it is in-use. 
But we see now the limit of the approach when we have to manage a read lock and 
a write lock, and a different thread to put back.
   
   The solution I see would be to have an eviction approver (hard no) before 
the entry is selected for eviction (so we prefer to not evict rather than 
putting back). In this case we accept the O(n) scan for the entry selection, 
and we also accept that the cache size exceeds the limit, knowing that Caffeine 
maintenance tasks will kick periodically to try to evict entries again later. 
Would that be a possible option for Caffeine?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to