I think you're really going to have to break the encapsulation model to accomplish this in the RPC layer. What about updating the serialized executor for those situations to do a resubmission rather than a blocking operation? Basically, it seems like we want a two-phase termination: request termination and then confirm termination. It seems like both should be non-blocking.
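As a rough sketch of the resubmission idea (this is not Drill's actual code; `SerializedQueue`, `Fragment`, `requestTermination`, and `confirmTermination` are hypothetical names made up for illustration): phase one only records the request and enqueues a confirmation task, and phase two either confirms or puts itself back on the queue instead of waiting.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Executor;

final class RequeueingTermination {

    // Hypothetical stand-in for a serialized executor: tasks run one at a
    // time, in FIFO order, on whichever thread calls drain().
    static final class SerializedQueue implements Executor {
        private final Deque<Runnable> tasks = new ArrayDeque<>();

        @Override
        public void execute(Runnable task) {
            tasks.addLast(task);
        }

        void drain() {
            Runnable task;
            while ((task = tasks.pollFirst()) != null) {
                task.run();
            }
        }
    }

    // Hypothetical stand-in for a fragment that may still be in setup.
    static final class Fragment {
        boolean setupDone;
        boolean terminationRequested;
        boolean terminated;
    }

    // Phase 1: record the request and return immediately.
    static void requestTermination(Fragment f, SerializedQueue q, List<String> events) {
        f.terminationRequested = true;
        events.add("termination requested");
        q.execute(() -> confirmTermination(f, q, events));
    }

    // Phase 2: confirm if setup has finished; otherwise resubmit this task
    // to the back of the queue rather than blocking the thread.
    static void confirmTermination(Fragment f, SerializedQueue q, List<String> events) {
        if (!f.setupDone) {
            events.add("confirmation requeued");
            q.execute(() -> confirmTermination(f, q, events));
            return;
        }
        f.terminated = true;
        events.add("termination confirmed");
    }

    static List<String> runDemo() {
        List<String> events = new ArrayList<>();
        SerializedQueue queue = new SerializedQueue();
        Fragment fragment = new Fragment();

        requestTermination(fragment, queue, events);
        queue.execute(() -> events.add("other request handled"));
        queue.execute(() -> {
            fragment.setupDone = true;
            events.add("setup finished");
        });
        queue.drain();
        return events;
    }

    public static void main(String[] args) {
        runDemo().forEach(System.out::println);
    }
}
```

In this toy run the confirmation is requeued once, the other request and the end of setup are handled in between, and only then is termination confirmed. One caveat with a naive requeue like this: if nothing else is on the queue, the task would spin, so a real version would probably need a delay or some event to trigger the resubmission.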
The other option is to rethink the model around termination. It might be worth a hangout brainstorm to see if we can come up with ideas that are more outside of the box.

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Fri, Apr 1, 2016 at 2:28 PM, Sudheesh Katkam <[email protected]> wrote:

> Hey y’all,
>
> There are some blocking requests that could make an event loop *await
> uninterruptibly*. At this point, the Drillbit might seem unresponsive. This
> is worsened if the event loop is not unblocked (due to a bug), which
> requires a Drillbit restart. Although Drill supports *offloading from the
> event loop* (experimental), this is not sufficient as the thread handling
> the queue of requests would still block.
>
> AFAIK there are two such requests:
> + when the user cancels
> <https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java#L1184>
> the query during planning
> + a fragment is canceled
> <https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java#L150>
> or terminated
> <https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java#L501>
> early during setup
>
> I think a simple solution would be to *re-queue* such requests (possible in
> above cases). That way other requests get their chance, and all requests
> would be eventually handled. Thoughts?
>
> Thank you,
> Sudheesh
