[ 
https://issues.apache.org/jira/browse/SOLR-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975573#comment-15975573
 ] 

Mark Miller edited comment on SOLR-10525 at 4/19/17 9:39 PM:
-------------------------------------------------------------

bq.  looks like the same issue of multiple requests stacking.

The reason for SOLR-8702, by the way, is to examine having no stacking at all. 
The issue that reduced stacking was titled more like "reduce stacking", not 
"eliminate stacking". To eliminate it, we would want a patch, and we would need 
to examine whether the change is worth any slowdown in recovery calls it might 
cause. Right now we can get hammered by recovery calls, and they should all be 
very, very fast and result in few or no stack-ups. Previously you stacked up 
every request.

In other words, if you eliminate stacking completely, is a recovery request 
going to cost more than a tryLock and an atomic integer increment? Because in a 
concurrent environment, that is super fast.


> Stacked recovery requests can interfere with one another
> --------------------------------------------------------
>
>                 Key: SOLR-10525
>                 URL: https://issues.apache.org/jira/browse/SOLR-10525
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Mike Drob
>         Attachments: SOLR-10525.patch
>
>
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/update/DefaultSolrCoreState.java#L300-L310
> Two issues with this code:
> {code}
>           boolean locked = recoveryLock.tryLock();
>           try {
>             if (!locked) {
>               if (recoveryWaiting.get() > 0) { // line 1
>                 return;
>               }
>               recoveryWaiting.incrementAndGet(); // line 2
>             } else {
>               recoveryWaiting.incrementAndGet();
>               cancelRecovery(); // line 3
>             }
> {code}
> The {{cancelRecovery}} call on line 3 will only be reached when there are no 
> recoveries to actually cancel (since we got the lock, that means there are no 
> recoveries in progress). Instead it should be moved either to the other 
> branch of the if, or outside after the if, since we know we will be running a 
> recovery at that point.
> This code doesn't always prevent multiple requests from stacking. If there is 
> a recovery running, but no recoveries currently waiting, multiple requests 
> can check the count at line 1 before any of them will increment the count at 
> line 2 and thus all of them will hit the increment.
> I don't have specific tests for this, but it's causing failures for me on my 
> SOLR-9555 work in progress.
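
One way to close the check-then-increment window described in the second point 
above is to fold the check and the increment into a single atomic operation. 
The sketch below is only an illustration, not the attached patch: the 
WaiterSlot class and its tryClaim, release and contend methods are hypothetical 
names, and only recoveryWaiting is taken from the quoted snippet.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the code in DefaultSolrCoreState.
class WaiterSlot {
  // Plays the role of recoveryWaiting in the quoted snippet.
  private final AtomicInteger recoveryWaiting = new AtomicInteger();

  // Claims the single waiter slot. The check ("is anyone waiting?") and the
  // increment happen as one atomic step, so only one caller can succeed.
  boolean tryClaim() {
    return recoveryWaiting.compareAndSet(0, 1);
  }

  // Frees the slot once the waiter has started its own recovery.
  void release() {
    recoveryWaiting.decrementAndGet();
  }

  // Starts `threads` concurrent callers and returns how many claimed the slot.
  static int contend(int threads) {
    WaiterSlot slot = new WaiterSlot();
    AtomicInteger winners = new AtomicInteger();
    CountDownLatch start = new CountDownLatch(1);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        try {
          start.await(); // line the callers up so they race on tryClaim()
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        if (slot.tryClaim()) {
          winners.incrementAndGet();
        }
      });
      workers[i].start();
    }
    start.countDown(); // release all callers at once
    for (Thread t : workers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return winners.get();
  }

  public static void main(String[] args) {
    System.out.println("winners = " + contend(16));
  }
}
```

Since nothing calls release() during the race, compareAndSet(0, 1) can succeed 
only once, so a single caller ends up as the waiter no matter how the threads 
interleave. A compareAndSet is one atomic operation, much like incrementAndGet, 
so this should stay in the same cost range as the existing check, though that 
would need measuring against real recovery traffic.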


