Re: [PR] SOLR-17158 Terminate distributed processing quickly when query limit is reached [solr]

via GitHub Thu, 26 Sep 2024 09:11:18 -0700


gus-asf commented on code in PR #2666:
URL: https://github.com/apache/solr/pull/2666#discussion_r1777385925



##########
solr/core/src/java/org/apache/solr/handler/component/ParallelHttpShardHandler.java:
##########
@@ -41,57 +39,78 @@
 @NotThreadSafe
 public class ParallelHttpShardHandler extends HttpShardHandler {
 
+  @SuppressWarnings("unused")
   private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
 
   private final ExecutorService commExecutor;
 
+  /*
+   * Unlike the basic HttpShardHandler, this class allows us to exit submit 
before
+   * pending is incremented and the responseFutureMap is updated. If the 
runnables that

Review Comment:
   The concurrency fixes here and changes in HttpShardHandler come from me 
needing to implement this and being worried about the effects of limits 
interrupting/canceling requests. I felt compelled to ensure that canceling 
futures from within the call back (to facilitate a faster short circuit once we 
know the request is hitting a limit) wouldn't cause concurrency issues. ... so 
I put effort into understanding exactly what's going on and what the controls 
are. I found that (before this ticket) there was reliance on the orderly update 
of pending, a map of futures and the request queue object. Converting the map 
to a synchronized object didn't seem sufficient since the logic relies on 
coordination among objects... so essentially I had to make sure everything was 
going to handle a new short circuit point initiated by the callback and 
concurrency became "on-topic"... 
   
   At a high level this ticket works by letting the callbacks [signal to stop 
all other 
callbacks](https://github.com/apache/solr/pull/2666/files#diff-ce2580113ae9a9d50f308985229130f094868e83d35a8727c5683e6a2f3f44daR543)
 if one of them is failing and partial results are not desired. There's another 
opportunity to quit early in the take() method but that caused too many 
problems (rare test failures likely due to further introduced concurrency 
issues).
   
   I do suspect some of this could be simplified, particularly what I note 
here: 
https://github.com/apache/solr/pull/2666/files#diff-ce2580113ae9a9d50f308985229130f094868e83d35a8727c5683e6a2f3f44daR98
 
   
   But I decided to only guard and de-duplicate the existing logic. I'm leaving 
revising the logic to a later effort in order to control scope. 
   
   In the case of your class (which came in as a merge), this was the simplest 
thing I could think of (as in simple to understand, not simple/elegant). 
Basically we need to know that everything we scheduled has been attempted (fail 
or not) before we let the loop that takes responses exit or we might miss one, 
but we also have to ensure that a failure won't leave us looping forever. So 
the quick solution was to track what's started, and track what's been 
attempted, and make sure we don't quit waiting for responses unless they are 
equal.  Note that attempts may not result in a a future if 
`this.lbClient.requestAsync(lbReq);` fails, so the count of futures is not good 
enough.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] SOLR-17158 Terminate distributed processing quickly when query limit is reached [solr]

Reply via email to