zhuqi-lucas commented on issue #16149: URL: https://github.com/apache/datafusion/issues/16149#issuecomment-2903175289
> A fun experiment might be to "fix" the clickbench partitioned dataset by resorting and writing with page indexes (could use a bunch of DataFusion COPY commands pretty easily to do this). The sort order should be some subset of the predicate columns. Perhaps EventTime and then maybe SearchPhrase / URL.

> disabling compression

This is very interesting. Maybe we can also do this for the arrow-rs ClickBench benchmark to see the result.
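For reference, here is a rough sketch of how the re-sorting COPY commands could be generated. It only builds SQL strings and does not run them. The `hits_{i}.parquet` input names, the `sorted/` output directory, the use of a file path as a table name, and the `compression` option spelling are all assumptions on my side and should be checked against the DataFusion version in use. Page index settings are not handled here.

```python
# Sketch: generate one DataFusion COPY statement per partitioned file.
# All file names and the OPTIONS spelling below are assumptions, not
# taken from the issue; verify them before running the output.

def copy_statement(src, dst, sort_columns, compression="uncompressed"):
    """Build a COPY statement that rewrites `src` to `dst`, sorted by `sort_columns`."""
    # Quote column names so mixed-case names such as EventTime survive.
    order_by = ", ".join(f'"{c}"' for c in sort_columns)
    return (
        f"COPY (SELECT * FROM '{src}' ORDER BY {order_by}) "
        f"TO '{dst}' STORED AS PARQUET "
        f"OPTIONS (compression '{compression}')"
    )


def copy_statements(n_files, sort_columns):
    """Build statements for hypothetical inputs hits_0.parquet .. hits_{n_files-1}.parquet."""
    return [
        copy_statement(f"hits_{i}.parquet", f"sorted/hits_{i}.parquet", sort_columns)
        for i in range(n_files)
    ]


if __name__ == "__main__":
    # Sort by EventTime first, then SearchPhrase, as suggested in the quote.
    for stmt in copy_statements(2, ["EventTime", "SearchPhrase"]):
        print(stmt + ";")
```

The printed statements could then be pasted into datafusion-cli, one per file, and the same rewritten files reused for the arrow-rs benchmark.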