Re: [PR] Fix deadlock that can occur while merging group by results (druid)

via GitHub Thu, 18 Jan 2024 03:28:40 -0800


LakshSingla commented on PR #15420:
URL: https://github.com/apache/druid/pull/15420#issuecomment-1898301692


   @gianm 
   > The new context keys add to complexity because there's no mechanism that 
ensures they are set at the appropriate places, so it's tough to understand how 
and when they're supposed to be set. They're kind of "magical".
   
   I think my use of keys has been exaggerated, and can probably be minimized 
if we have a unique key set on the context per query, which can be used to 
reference the already reserved resources from a globally available pool (with 
some reservation built-in. 
   
   With regards to cloning, I don't think there's a real need for it, apart 
from ensuring that the callers up the stack of query runners don't get a 
contaminated pool, which contains keys set by the children. I was envisioning 
this would be useful in case when there are multiple to the 
mergeResults->mergeRunners called by a parent runner, being called in parallel. 
The child runner won't have a chance to clean up the response context object, 
while it gets passed to the other runner.
   
   Regarding the above comments, I have come up with the following alternative:
   
   1. The `mergeResults` will have `willMergeRunners` method, which will get 
populated by the callers, and in the production code will only be set to false 
by the `CachingClusteredClient`. Note: Can this be done by the caching 
clustered client as well, in case the segment caching is enabled? (Probably so, 
but GroupByQuery ignores segment caching so, the semantics of the flag will be 
a bit different than actually demonstrating - "will it merge runners")
   2. Having a reservation pool like
   ```java
   GroupByResourcesReservationPool(){
       public void reserve(String uniqueId, int numMergeBuffers)
       public void reserve(String uniqueId, int numMergeBuffers, long timeoutMs)
       public GroupByQueryResources take(String uniqueId)
       public GroupByQueryResources cleanup(String uniqueId)
   }


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Fix deadlock that can occur while merging group by results (druid)

Reply via email to