karlovnv opened a new issue, #10433:
URL: https://github.com/apache/datafusion/issues/10433

   ### Is your feature request related to a problem or challenge?
   
   Consider a huge data source that consists of many record batches.
   Currently it is impossible to get the most recent N rows without a full scan:
   
   ``` sql
   SELECT * FROM Events
   ORDER BY event_time DESC
   LIMIT 1000
   ```
   
   The query above performs a full scan starting from the first row, even though 
the TableProvider may know that it only needs to provide the last record batches 
(or the latest Parquet files in a folder).
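
To illustrate the kind of pruning a provider could do, here is a minimal sketch that picks which files to read for "latest N rows" using per-file time statistics. `FileMeta` and `files_for_latest` are hypothetical names made up for this example and are not part of DataFusion. The sketch assumes each file knows the minimum and maximum `event_time` it contains and its row count.

```rust
/// Hypothetical per-file statistics (illustration only, not a DataFusion type).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    name: &'static str,
    min_time: i64,
    max_time: i64,
    rows: usize,
}

/// Returns the files that may contain any of the `n` newest rows.
/// Files are visited newest first; once enough rows are covered, only files
/// whose time range still overlaps the covered range are kept.
fn files_for_latest(files: &[FileMeta], n: usize) -> Vec<&FileMeta> {
    if n == 0 {
        return Vec::new();
    }
    // Ignore empty files, then order by newest maximum timestamp first.
    let mut by_newest: Vec<&FileMeta> = files.iter().filter(|f| f.rows > 0).collect();
    by_newest.sort_by(|a, b| b.max_time.cmp(&a.max_time));

    let mut selected = Vec::new();
    let mut rows = 0usize;
    // Once `rows >= n`, the n-th newest row cannot be older than this bound.
    let mut lower_bound = i64::MAX;
    for f in by_newest {
        if rows >= n {
            if f.max_time < lower_bound {
                break; // every remaining file is strictly older
            }
            // Overlapping time range: the file may still hold newer rows.
            selected.push(f);
            continue;
        }
        rows = rows.saturating_add(f.rows);
        lower_bound = lower_bound.min(f.min_time);
        selected.push(f);
    }
    selected
}
```

With files covering times 0..99, 100..199, 200..299 and 250..320, asking for the latest 1000 rows would touch only the two newest files if together they hold at least 1000 rows. Without knowing the sort order, the provider has no way to make this choice.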
   
   ### Describe the solution you'd like
   
   Currently, TableProvider::scan accepts filters and a limit:
   
   ``` rust
   async fn scan(
           &self,
           state: &SessionState,
           projection: Option<&Vec<usize>>,
           // filters and limit can be used here to inject some push-down
           // operations if needed
           filters: &[Expr],
           limit: Option<usize>,
       ) -> Result<Arc<dyn ExecutionPlan>> {
   ```
   
   Let's also add a sort expression so that the sort can be pushed down as well. For example:
   
   ``` rust
   async fn scan(
           &self,
           state: &SessionState,
           projection: Option<&Vec<usize>>,
           // filters and limit can be used here to inject some push-down
           // operations if needed
           filters: &[Expr],
           // sort expression
           sorting: &[Expr],
           limit: Option<usize>,
       ) -> Result<Arc<dyn ExecutionPlan>> {
   ```
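
As a rough sketch of how a provider might use such a parameter, the example below models the idea with plain Rust types instead of the real DataFusion API. `SortKey`, `TimeOrderedTable` and `ScanResult` are hypothetical stand-ins invented for illustration. The sketch assumes the table stores `event_time` values in ascending order within and across batches.

```rust
/// Hypothetical stand-in for a sort expression (not a DataFusion type).
#[derive(Debug, Clone, PartialEq)]
struct SortKey {
    column: String,
    descending: bool,
}

/// Hypothetical table: batches of event times, ascending within and across batches.
struct TimeOrderedTable {
    batches: Vec<Vec<i64>>,
}

/// Rows produced by a scan plus the number of batches that were read.
struct ScanResult {
    rows: Vec<i64>,
    batches_read: usize,
}

impl TimeOrderedTable {
    /// Simplified scan that receives the requested ordering along with the limit.
    fn scan(&self, sorting: &[SortKey], limit: Option<usize>) -> ScanResult {
        // Only the single-key "event_time DESC" case is handled by the fast path.
        let newest_first =
            matches!(sorting, [k] if k.column == "event_time" && k.descending);
        match (newest_first, limit) {
            (true, Some(n)) => {
                let mut rows = Vec::new();
                let mut batches_read = 0;
                // Walk the batches from the end and stop once n rows are collected.
                for batch in self.batches.iter().rev() {
                    if rows.len() >= n {
                        break;
                    }
                    batches_read += 1;
                    rows.extend(batch.iter().rev().take(n - rows.len()).copied());
                }
                ScanResult { rows, batches_read }
            }
            // Fallback: read everything in storage order and leave sorting
            // and limiting to the query engine.
            _ => ScanResult {
                rows: self.batches.iter().flatten().copied().collect(),
                batches_read: self.batches.len(),
            },
        }
    }
}
```

In this model, a scan for `event_time DESC` with a limit of 4 over three batches of three rows each reads only the last two batches, while a scan without the sort information has to read all three.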
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

