kfaraz commented on issue #12692:
URL: https://github.com/apache/druid/issues/12692#issuecomment-1170869825

   Great find, @abhishekagarwal87 ! I agree with the fix for the expiration bug.
   
   > We should definitely fix the peons to ignore the contents belonging to a 
different task group.
   
   This would make sense but there is an issue. The peon needs to be aware of 
the contents to know if it can reserve a location or not. In other words, the 
instance of `StorageLocation` must be updated with all the files that exist so 
that it knows how much space it can allocate for new partition files.
   
   ---
   I thought of breaking up the `IntermediaryDataManager` into two parts, one 
which is used by the peon, solely to `addSegment()` and the other used by 
indexer/middle manager to perform discovery and cleanup. But there are two 
issues with this:
   - As before, `StorageLocation.reserve()` would not work correctly for peon 
   - `IntermediaryDataManager` is an extension point, so it would not be 
desirable to change this contract
   
   Also, there are some race conditions possible even outside the ones you 
mentioned. In particular, between two invocations of the discovery method, a 
task would have reserved space and exhausted the storage location which might 
cause another task to fail. But since the default discovery period is small 
enough (60s), it has not been an issue, I guess.
   
   ---
   **Proposal:**
   Expose a `reserve` API on the ShuffleResource (used by middle 
manager/indexer) and peons would call that to reserve a location.
   - eliminates race conditions as middle manager becomes the sole storage 
resource manager
   - peon logic would become much simpler
      - try to reserve using ShuffleResource API
      - if succeeds, add segment
      - if fails, try next location
      - does not need to keep track of anything else
      - does not perform cleanup or discovery
   - we can continue to use the same `IntermediaryDataManager` contract
      
   Cons:
   - new API call between peon and middle manager whenever pushing segment (not 
very frequent)
   - the new `reserve` API is relevant only to local storage, as deep storage 
need not perform a reserve functionality (could be a noop for deep storage)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to