clintropolis opened a new pull request, #18683:
URL: https://github.com/apache/druid/pull/18683

   ### Description
   This PR makes some behavioral changes to how historical servers manage their 
cache files when operating in the 'virtual storage' mode added in #18176, to 
provide a better experience and more flexibility to callers.
   
   The main change is that in virtual storage mode, a segment can no longer be 
'missing' from the `SegmentManager` if the caller has a `DataSegment`. 'Missing' 
here refers to the specific response a historical server reports in the query 
response context when a segment drop command occurs before the query engine has 
a chance to obtain a reference to the files (or even before the request is 
received). The broker uses this response to retry the request against the new 
location the segment has been moved to (if retries are enabled and the segment 
is available elsewhere).
   
   Now, in virtual storage mode, the historical will simply download the 
segment on demand and process it for any `DataSegment` object. So for example, 
when using `ServerManager` to get segments, this means that after resolving the 
set of `DataSegments` participating in a query from the timeline, the query 
engine can always obtain a `SegmentReference`, so long as there is adequate disk 
space to download the segment (and the segment is present in deep storage, of course).
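
   As a rough illustration of the on-demand behavior described above, here is a 
minimal sketch. It is not the code in this PR: `OnDemandSegmentCache`, `acquire`, 
and `downloadCount` are hypothetical names, the "download" is simulated by 
bookkeeping only, and eviction is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch only; none of these names are Druid classes or methods.
class OnDemandSegmentCache {
    private final long capacityBytes;
    private long usedBytes = 0;
    private int downloads = 0;
    // segment id -> size of the files currently on local disk
    private final Map<String, Long> mounted = new HashMap<>();

    OnDemandSegmentCache(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /**
     * Returns a stand-in "reference" (just the id) for the segment. If the
     * segment is not on disk it is "downloaded" on demand instead of being
     * reported as missing. Empty only when there is not enough space.
     */
    Optional<String> acquire(String segmentId, long sizeBytes) {
        if (mounted.containsKey(segmentId)) {
            return Optional.of(segmentId); // already cached, no download needed
        }
        if (usedBytes + sizeBytes > capacityBytes) {
            return Optional.empty(); // inadequate disk space
        }
        downloads++; // stand-in for pulling the files from deep storage
        mounted.put(segmentId, sizeBytes);
        usedBytes += sizeBytes;
        return Optional.of(segmentId);
    }

    int downloadCount() {
        return downloads;
    }
}
```

   In this sketch, the only way `acquire` fails is a lack of space; a segment 
that was never loaded locally is handled the same way as one that was dropped.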
   
   To complement this behavior, the segment cache itself no longer actively 
unmounts weakly held segment cache entries, making the coordinator drop command 
a near no-op. Instead, files remain on disk until they are evicted because 
some other segment needs to reclaim the space in order to be loaded. This 
does mean that the disk will 'fill up' over time, but the trade-off is worth the 
benefits. To accommodate this and ensure that segment files are tracked across 
process boundaries (for example, in the event of a failure), segment info files 
are now deleted via an 'unmount' hook on the segment files cache entry, meaning 
that they are not removed until the segment files are also removed.
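
   The eviction and 'unmount' hook behavior described above can be sketched 
roughly as follows. This is an illustration under stated assumptions, not the 
implementation in this PR: `WeakSegmentCache`, `mount`, `drop`, and `isMounted` 
are hypothetical names, every entry is treated as weakly held, and 
least-recently-used ordering is an assumed eviction policy that the description 
does not specify.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; none of these names are Druid classes or methods.
class WeakSegmentCache {
    /** Cache entry: file size plus a hook that runs when the files are unmounted. */
    private static final class Entry {
        final long sizeBytes;
        final Runnable onUnmount;

        Entry(long sizeBytes, Runnable onUnmount) {
            this.sizeBytes = sizeBytes;
            this.onUnmount = onUnmount;
        }
    }

    private final long capacityBytes;
    private long usedBytes = 0;
    // Access-ordered map: iteration starts at the least recently used entry.
    // LRU is an assumption made for this sketch.
    private final LinkedHashMap<String, Entry> entries =
        new LinkedHashMap<>(16, 0.75f, true);

    WeakSegmentCache(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Mounts a segment, evicting older entries only if space must be reclaimed. */
    boolean mount(String id, long sizeBytes, Runnable onUnmount) {
        if (entries.get(id) != null) {
            return true; // already on disk; get() also refreshes its LRU position
        }
        if (sizeBytes > capacityBytes) {
            return false; // can never fit, so do not evict anything
        }
        Iterator<Map.Entry<String, Entry>> it = entries.entrySet().iterator();
        while (usedBytes + sizeBytes > capacityBytes && it.hasNext()) {
            Entry victim = it.next().getValue();
            it.remove();
            usedBytes -= victim.sizeBytes;
            // The hook is where the segment info file would be deleted, so the
            // info file lives exactly as long as the segment files do.
            victim.onUnmount.run();
        }
        entries.put(id, new Entry(sizeBytes, onUnmount));
        usedBytes += sizeBytes;
        return true;
    }

    /** Models the coordinator drop command: files stay on disk until evicted. */
    void drop(String id) {
        // Intentionally a no-op in this sketch.
    }

    boolean isMounted(String id) {
        return entries.containsKey(id);
    }
}
```

   With a capacity of 100, mounting a 60-byte segment, dropping it, and then 
mounting a second 60-byte segment leaves the first segment on disk after the 
drop and runs its hook only when the second segment needs the space.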


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

