gus-asf commented on code in PR #2666:
URL: https://github.com/apache/solr/pull/2666#discussion_r1777385925
##########
solr/core/src/java/org/apache/solr/handler/component/ParallelHttpShardHandler.java:
##########
@@ -41,57 +39,78 @@
@NotThreadSafe
public class ParallelHttpShardHandler extends HttpShardHandler {
+ @SuppressWarnings("unused")
private static final Logger log =
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
private final ExecutorService commExecutor;
+ /*
+ * Unlike the basic HttpShardHandler, this class allows us to exit submit
before
+ * pending is incremented and the responseFutureMap is updated. If the
runnables that
Review Comment:
The concurrency fixes here and changes in HttpShardHandler come from me
needing to implement this and being worried about the effects of limits
interrupting/canceling requests. I felt compelled to ensure that canceling
futures from within the call back (to facilitate a faster short circuit once we
know the request is hitting a limit) wouldn't cause concurrency issues. ... so
I put effort into understanding exactly what's going on and what the controls
are. I found that (before this ticket) there was reliance on the orderly update
of pending, a map of futures and the request queue object. Converting the map
to a synchronized object didn't seem sufficient since the logic relies on
coordination among objects... so essentially I had to make sure everything was
going to handle a new short circuit point initiated by the callback and
concurrency became "on-topic"...
At a high level this ticket works by letting the callbacks [signal to stop
all other
callbacks](https://github.com/apache/solr/pull/2666/files#diff-ce2580113ae9a9d50f308985229130f094868e83d35a8727c5683e6a2f3f44daR543)
if one of them is failing and partial results are not desired. There's another
opportunity to quit early in the take() method but that caused too many
problems (rare test failures likely due to further introduced concurrency
issues).
I do suspect some of this could be simplified, particularly what I note
here:
https://github.com/apache/solr/pull/2666/files#diff-ce2580113ae9a9d50f308985229130f094868e83d35a8727c5683e6a2f3f44daR98
But I decided to only guard and de-duplicate the existing logic. I'm leaving
revising the logic to a later effort in order to control scope.
In the case of your class (which came in as a merge), this was the simplest
thing I could think of (as in simple to understand, not simple/elegant).
Basically we need to know that everything we scheduled has been attempted (fail
or not) before we let the loop that takes responses exit or we might miss one,
but we also have to ensure that a failure won't leave us looping forever. So
the quick solution was to track what's started, and track what's been
attempted, and make sure we don't quit waiting for responses unless they are
equal. Note that attempts may not result in a a future if
`this.lbClient.requestAsync(lbReq);` fails, so the count of futures is not good
enough.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]