alamb commented on PR #13293: URL: https://github.com/apache/datafusion/pull/13293#issuecomment-2540053084
@suremarc if you are going to work on Statistics, here are some properties I think would be most useful:

1. Minimize the downstream API impact as much as possible (aka give downstream users a chance to adjust)
2. Ensure that `Statistics` are cheaply `clone`able (as it is, copying Statistics for tables with many strings shows up often in our profiles for short queries)

It would be really great to consolidate the statistics aggregation code (e.g. that combines statistics across files) into a single struct / location (but that is a good follow on perhaps)
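
To illustrate the "cheaply cloneable" property, here is a minimal sketch of one possible approach: keep the per-column statistics behind an `Arc` so that `clone` only bumps a reference count instead of copying every string. The `TableStats` and `ColumnStats` types and the `set_null_count` helper are hypothetical names made up for this example; they are not DataFusion's actual `Statistics` API, and this is not necessarily the design the PR will adopt.

```rust
use std::sync::Arc;

/// Hypothetical per-column statistics. String min/max values own heap data,
/// which is what makes a deep copy expensive for wide tables.
#[derive(Debug, Clone, PartialEq)]
struct ColumnStats {
    min: Option<String>,
    max: Option<String>,
    null_count: Option<usize>,
}

/// Hypothetical table-level statistics (not DataFusion's real struct).
/// The column statistics sit behind an `Arc`, so `clone` is a
/// reference-count increment rather than a copy of every string.
#[derive(Debug, Clone)]
struct TableStats {
    num_rows: Option<usize>,
    columns: Arc<Vec<ColumnStats>>,
}

impl TableStats {
    fn new(num_rows: Option<usize>, columns: Vec<ColumnStats>) -> Self {
        Self {
            num_rows,
            columns: Arc::new(columns),
        }
    }

    /// Copy-on-write update: `Arc::make_mut` deep-copies the column vector
    /// only when another clone still shares it.
    fn set_null_count(&mut self, idx: usize, null_count: usize) {
        let cols = Arc::make_mut(&mut self.columns);
        cols[idx].null_count = Some(null_count);
    }
}

fn main() {
    let stats = TableStats::new(
        Some(3),
        vec![ColumnStats {
            min: Some("apple".to_string()),
            max: Some("zebra".to_string()),
            null_count: Some(0),
        }],
    );

    // Cloning shares the same allocation; no strings are copied.
    let mut copy = stats.clone();
    assert!(Arc::ptr_eq(&stats.columns, &copy.columns));

    // Mutating the clone triggers a copy; the original is left untouched.
    copy.set_null_count(0, 1);
    assert!(!Arc::ptr_eq(&stats.columns, &copy.columns));
    assert_eq!(stats.columns[0].null_count, Some(0));
    assert_eq!(copy.columns[0].null_count, Some(1));
    println!("rows: {:?}", stats.num_rows);
}
```

The trade-off in this sketch is that any mutation of shared statistics pays for a one-time deep copy, which suits a workload where statistics are cloned often (e.g. during planning of short queries) and modified rarely.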