korowa commented on PR #11627:
URL: https://github.com/apache/datafusion/pull/11627#issuecomment-2254539608

   > 1000 partitions
   
   @alamb this is also a bit unexpected: the default number of rows after which 
the check fires is 100_000, and it's applied per partition (each partition 
normally processes at least 100k rows without skipping aggregation), while the 
total number of rows in the file is ~100M (if I'm not mistaken). So this 
optimization should not provide any benefit in this case: with 1000 partitions, 
each partition will read only ~100_000 rows anyway :thinking: 
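
A rough sketch of the arithmetic behind this reasoning, assuming rows are split evenly across partitions and taking the row count as exactly 100 million (the comment only gives it approximately). The function and constant names are illustrative and are not DataFusion identifiers.

```python
# Back-of-the-envelope check of the numbers discussed in the comment.
# All names here are hypothetical; none come from the DataFusion codebase.

PROBE_ROWS_THRESHOLD = 100_000  # default rows a partition processes before the check fires
TOTAL_ROWS = 100_000_000        # approximate row count of the file (assumed exact here)


def rows_per_partition(total_rows: int, partitions: int) -> int:
    """Rows each partition reads, assuming a perfectly even split."""
    return total_rows // partitions


def rows_after_check(total_rows: int, partitions: int, threshold: int) -> int:
    """Rows per partition that remain once the threshold has been reached.

    Only these rows could possibly skip partial aggregation.
    """
    return max(0, rows_per_partition(total_rows, partitions) - threshold)


if __name__ == "__main__":
    for partitions in (10, 100, 1000):
        per_part = rows_per_partition(TOTAL_ROWS, partitions)
        left = rows_after_check(TOTAL_ROWS, partitions, PROBE_ROWS_THRESHOLD)
        print(f"{partitions:>5} partitions: {per_part:>10} rows each, "
              f"{left:>10} rows left after the check")
```

Under these assumptions, 1000 partitions leaves no rows per partition after the check, which is why a speedup in that configuration looks surprising.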


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

