alamb commented on issue #21733:
URL: https://github.com/apache/datafusion/issues/21733#issuecomment-4281359699

   I think this idea of "heuristically choose the order of files to scan to try 
and maximize dynamic filter efficiency" is a really neat one. 
   
   
   So where I am heading is that it would be nice to have some sort of generic 
API like "reorder_files_heuristically" in the FileStream / shared work queue, 
rather than hard-coding the sortedness heuristic.
   
   I realize the topk / sorting heuristic is probably the most important one, but 
there may be others. Also, I think setting up a reasonable API will help keep 
the code structure easier to understand.
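   
   To make the idea concrete, here is a minimal sketch of what such a hook 
could look like. Every name in it (`FileInfo`, `FileOrderHeuristic`, 
`SortednessHeuristic`, `KeepOrder`) is hypothetical and is not an existing 
DataFusion type; the only name taken from the discussion is 
"reorder_files_heuristically". It assumes each file carries an optional 
minimum value for the sort column, taken from file statistics.

```rust
/// Hypothetical per-file summary (not a DataFusion type).
#[derive(Debug, Clone, PartialEq)]
pub struct FileInfo {
    pub path: String,
    /// Minimum value of the sort column from file statistics, if known.
    pub min_sort_key: Option<i64>,
}

/// Hypothetical extension point: decides the order in which files are scanned.
pub trait FileOrderHeuristic {
    fn reorder_files_heuristically(&self, files: Vec<FileInfo>) -> Vec<FileInfo>;
}

/// Default behavior: keep whatever order the planner produced.
pub struct KeepOrder;

impl FileOrderHeuristic for KeepOrder {
    fn reorder_files_heuristically(&self, files: Vec<FileInfo>) -> Vec<FileInfo> {
        files
    }
}

/// One possible heuristic for an ascending TopK: scan files with the
/// smallest minimum first so the dynamic filter can tighten early.
pub struct SortednessHeuristic;

impl FileOrderHeuristic for SortednessHeuristic {
    fn reorder_files_heuristically(&self, mut files: Vec<FileInfo>) -> Vec<FileInfo> {
        // Stable sort: files with statistics come first, ordered by minimum;
        // files without statistics go last and keep their relative order.
        files.sort_by_key(|f| (f.min_sort_key.is_none(), f.min_sort_key));
        files
    }
}
```

   With a trait like this, the FileStream / shared work queue would only need 
to call `reorder_files_heuristically` once before handing out files, and other 
heuristics could be added without touching the queue itself.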
   
   Thank you for working on this @zhuqi-lucas 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

