gianm commented on issue #12958:
URL: https://github.com/apache/druid/issues/12958#issuecomment-1227904609

   The approach makes sense to me. There would be a couple of limitations:
   
   - We would have to scan the entire dataset (everything matching the filter), 
since, unlike with a time-ordered scan, the data is not already pre-ordered.
   - We would need to require that the limit be set to some reasonable value. If 
the limit is too high, servers would run out of memory.
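   The bounded in-memory approach could be sketched roughly like this (an 
illustrative standalone example, not Druid's actual code; the class and method 
names are made up). It keeps at most `limit` rows in a max-heap while scanning, 
so memory stays O(limit) no matter how many rows match the filter:

   ```java
   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.List;
   import java.util.PriorityQueue;

   public class BoundedSorter {
       // Return the `limit` smallest keys in ascending order (ORDER BY key LIMIT n).
       public static List<Integer> topK(Iterable<Integer> rows, int limit) {
           // Max-heap of the best `limit` keys seen so far; the head is the
           // worst retained row, so it can be evicted cheaply.
           PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
           for (int row : rows) {
               if (heap.size() < limit) {
                   heap.add(row);
               } else if (row < heap.peek()) {
                   heap.poll();   // evict the current worst row
                   heap.add(row);
               }
           }
           List<Integer> result = new ArrayList<>(heap);
           Collections.sort(result); // final ascending order for the result set
           return result;
       }
   }
   ```

   This is why the limit has to be bounded: the heap itself must fit in memory, 
unlike the disk-spilling SuperSorter path.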
   
   Btw, the code added in #12848 (SuperSorter) and #12918 (multi-stage query 
task) actually does support Scan with ORDER BY and disk-based sorting (so the 
limit can be very high, constrained only by available disk space). But the 
approach you describe would be better for smaller limits, since it keeps 
everything in memory and would be more efficient.
   
   Would you be interested in implementing this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
