clintropolis commented on PR #18176:
URL: https://github.com/apache/druid/pull/18176#issuecomment-3025121463

   >👍 Might be good to add a link to the original SIEVE paper: 
https://www.usenix.org/conference/nsdi24/presentation/zhang-yazhuo for those 
interested
   
   Yeah, totally. I have it linked in the javadocs for `StorageLocation` and was 
planning to add it to the PR description once I fill that in, when this is 
closer to ready for review. I just haven't gotten to it yet 😅 (still changing 
quite a few things).
   
   >@clintropolis , I haven't gone through the PR yet but how will this affect 
segment assignment/balancing on the Coordinator?
   
   This first PR requires no changes to the coordinator logic. When a 
historical is set to this mode, the idea is that you set `druid.server.maxSize` 
to how much data you want it to be responsible for, but set the sizes in 
`druid.segmentCache.locations` to be the actual disk sizes, and during querying 
the cache manager will load and drop segments internally as appropriate to stay 
within the constraints of the `druid.segmentCache.locations` sizes. I'll 
elaborate more in the PR description once this branch gets closer to review 
ready.
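
   As a rough illustration of the setup described above, a historical's 
`runtime.properties` might look something like the following. The path and 
the byte values are made up for the example, and the flag name is simply the 
one mentioned later in this comment, so treat this as a sketch rather than 
recommended settings.

```properties
# Amount of segment data this historical should be responsible for
# (illustrative value: 10 TB).
druid.server.maxSize=10000000000000

# Actual disk space available to the segment cache
# (illustrative path and value: 1 TB).
druid.segmentCache.locations=[{"path":"/mnt/druid/segment-cache","maxSize":1000000000000}]

# Mode flag as named in this comment.
druid.segmentCache.isVirtualStorageFabric=true
```

   With values like these, the server is responsible for more data than fits 
on disk, and the cache manager loads and drops segments at query time to stay 
within the 1 TB location size.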
   
   I do have some ideas for follow-up work: adding a new type of "weak" load 
rule that lets historicals use the on-demand load logic this PR applies to all 
segments when `druid.segmentCache.isVirtualStorageFabric` is set to true, 
while still supporting regular segment loads. This would allow for finer grained 
control over how segments are loaded by allowing some segments to be sticky and 
always present in the disk cache (the cache manager supports this internally), 
while others would be weak references and load on demand but be eligible to be 
dropped if new strong or weak loads need the space. This likely does require 
some adjustments to coordinator balancing to distinguish 'weak' loads from 
regular loads, to ensure the regular loads do not exceed actual disk space, but I 
think the changes would be pretty minor. Was planning to address this too in 
the PR description (or maybe a linked design proposal issue, not sure, haven't 
decided yet).
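
   To make the strong versus weak distinction concrete, here is a minimal toy 
sketch in Python. It is not Druid code: the class `DiskCacheSketch`, its 
method names, and the segment ids are all hypothetical, and simple 
oldest-first eviction stands in for whatever policy the real cache manager 
uses. It only shows the idea that sticky entries are never dropped, while weak 
entries are dropped when a new load needs the space.

```python
from collections import OrderedDict


class DiskCacheSketch:
    """Toy size-bounded cache with sticky ("strong") and evictable ("weak") entries."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.strong = {}           # segment id -> size; never evicted
        self.weak = OrderedDict()  # segment id -> size; oldest entry first

    def load(self, segment_id, size_bytes, weak):
        """Reserve space for a segment, evicting weak entries if needed.

        Returns True if the segment is cached afterwards, False if it cannot fit.
        """
        if segment_id in self.strong or segment_id in self.weak:
            return True
        # Strong entries are never dropped, so they bound what can be reclaimed.
        strong_bytes = sum(self.strong.values())
        if strong_bytes + size_bytes > self.capacity_bytes:
            return False
        # Drop the oldest weak entries until the new segment fits.
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, evicted_size = self.weak.popitem(last=False)
            self.used_bytes -= evicted_size
        target = self.weak if weak else self.strong
        target[segment_id] = size_bytes
        self.used_bytes += size_bytes
        return True


cache = DiskCacheSketch(capacity_bytes=100)
cache.load("sticky-1", 40, weak=False)
cache.load("weak-1", 30, weak=True)
cache.load("weak-2", 30, weak=True)
cache.load("weak-3", 30, weak=True)  # cache is full, so the oldest weak entry is dropped
```

   In this toy model a strong load is refused when the strong entries alone 
would exceed capacity, which is the condition coordinator balancing would need 
to avoid for regular loads.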


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


