alamb commented on PR #17275:
URL: https://github.com/apache/datafusion/pull/17275#issuecomment-3271688853

   > I found a potential performance regression with `parquet 56.1.0`. Now more 
data pages will be returned if their size is less than the execution batch 
size. For example:
   
   Thanks @nuno-faria -- this is a great find. @XiangpengHao and I purposely 
added a setting that allows disabling the cache for precisely this reason.
   
   So what I think is needed here is a way to disable the cache via a 
DataFusion setting as well, which is what I was trying to say with:
   
   >  . Add new Parquet option to control the size of the predicate cache
   
   
   Let me give this a try and see if we can get it working better.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

