Mike Drob created SOLR-10525:
--------------------------------

             Summary: Stacked recovery requests can interfere with one another
                 Key: SOLR-10525
                 URL: https://issues.apache.org/jira/browse/SOLR-10525
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: SolrCloud
            Reporter: Mike Drob


https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/update/DefaultSolrCoreState.java#L300-L310

Two issues with this code:

{code}
          boolean locked = recoveryLock.tryLock();
          try {
            if (!locked) {
              if (recoveryWaiting.get() > 0) { // line 1
                return;
              }
              recoveryWaiting.incrementAndGet(); // line 2
            } else {
              recoveryWaiting.incrementAndGet();
              cancelRecovery(); // line 3
            }
{code}

The {{cancelRecovery}} call on line 3 is only reached when there is no 
recovery to cancel, since getting the lock means no recovery is in progress. 
Instead it should be moved either to the other branch of the if, or outside 
after the if, since we know we will be running a recovery at that point.
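
A minimal sketch of the first option (moving the call to the other branch), not the actual patch. {{RecoveryGate}}, {{admit}}, {{started}} and the {{cancelCalls}} counter are hypothetical names; the {{tryLock()}} result is passed in as a parameter so the branches can be exercised without threads, and {{started}} assumes the waiting count is decremented once a recovery actually begins.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the excerpted branch logic; not the real DefaultSolrCoreState.
final class RecoveryGate {
    private final AtomicInteger recoveryWaiting = new AtomicInteger();
    private final AtomicInteger cancelCalls = new AtomicInteger();

    // Stub: the real cancelRecovery() would stop the in-progress recovery.
    private void cancelRecovery() {
        cancelCalls.incrementAndGet();
    }

    // 'locked' is the result of recoveryLock.tryLock(), passed in for illustration.
    // Returns true if the caller should go on to run a recovery.
    boolean admit(boolean locked) {
        if (!locked) {
            // Check-then-increment kept as in the excerpt (still not atomic).
            if (recoveryWaiting.get() > 0) {
                return false;
            }
            recoveryWaiting.incrementAndGet();
            // Lock is held elsewhere, so a recovery is in progress: cancel it here.
            cancelRecovery();
        } else {
            // Lock was free, so there is nothing to cancel.
            recoveryWaiting.incrementAndGet();
        }
        return true;
    }

    // Assumed hook: called when the waiting request actually starts its recovery.
    void started() {
        recoveryWaiting.decrementAndGet();
    }

    int cancelCalls() {
        return cancelCalls.get();
    }
}
```

Placing the call after the if instead would also cancel in the uncontended case, which should be harmless if {{cancelRecovery}} is a no-op when nothing is running.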

This code doesn't always prevent multiple requests from stacking. If there is a 
recovery running but no recovery currently waiting, multiple requests can all 
check the count at line 1 before any of them increments it at line 2, and so 
all of them will hit the increment. The check and the increment are separate 
operations rather than one atomic step.
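
One way to close this window, sketched under assumptions rather than taken from the Solr source, is to fold the check at line 1 and the increment at line 2 into a single {{compareAndSet(0, 1)}}. {{WaitingSlot}}, {{tryClaim}}, {{release}} and {{countClaims}} are hypothetical names used only for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: at most one request may occupy the "waiting" slot.
final class WaitingSlot {
    private final AtomicInteger recoveryWaiting = new AtomicInteger();

    // Check and increment in one atomic step: only one caller can move 0 -> 1.
    boolean tryClaim() {
        return recoveryWaiting.compareAndSet(0, 1);
    }

    // Frees the slot once the waiting request starts its recovery.
    void release() {
        recoveryWaiting.compareAndSet(1, 0);
    }

    // Starts 'threads' concurrent requests and returns how many claimed the slot.
    static int countClaims(int threads) {
        WaitingSlot slot = new WaitingSlot();
        AtomicInteger claims = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // line all requests up before racing
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (slot.tryClaim()) {
                    claims.incrementAndGet();
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return claims.get();
    }
}
```

With the separate {{get()}} and {{incrementAndGet()}} from the excerpt, several of the racing threads could pass the check; with the single compare-and-set, exactly one of them claims the slot and the rest return early.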

I don't have specific tests for this, but it's causing failures for me on my 
SOLR-9555 work in progress.




