sundy-li commented on issue #5404:
URL: 
https://github.com/apache/arrow-datafusion/issues/5404#issuecomment-1445319733

   TopK is only part of the reason.
   
   1.  Lazy projection (aka late projection) can improve this case: we fetch 
only the `URL` column in a first pass, apply the order and limit, and then 
project the other columns by row id.
   2.  `URL` is a large binary column in the hits dataset, and duckdb has 
optimized reading parquet into its memory model. You can verify that with 
`select max(URL) from table`.
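
A minimal sketch of the lazy projection idea in point 1, not DataFusion or duckdb code. It assumes the query orders by `URL` with a limit, and uses a hypothetical in-memory `table` dict (column name to list of values) plus a hypothetical helper `lazy_topk`; list positions stand in for row ids.

```python
import heapq

# Hypothetical stand-in for a columnar table: column name -> list of values.
# The position in each list plays the role of the row id.
table = {
    "URL": ["http://c", "http://a", "http://d", "http://b"],
    "Title": ["C", "A", "D", "B"],
    "EventTime": [30, 10, 40, 20],
}


def lazy_topk(columns, key_col, k):
    """ORDER BY key_col LIMIT k, touching the other columns only for k rows."""
    # Pass 1: scan only the key column and keep the row ids of the k
    # smallest values (ties broken by row id).
    key = columns[key_col]
    top_row_ids = heapq.nsmallest(k, range(len(key)), key=lambda i: (key[i], i))

    # Pass 2: project every column, but only for the selected row ids.
    return [{name: col[i] for name, col in columns.items()} for i in top_row_ids]


result = lazy_topk(table, "URL", 2)
for row in result:
    print(row)
```

The point of the two passes is that the wide columns are never read for rows that the limit discards; only the key column is scanned in full.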
   
   

