pmcgleenon commented on issue #17721: URL: https://github.com/apache/datafusion/issues/17721#issuecomment-3476400608
Now that there are multiple instance types for datafusion, I've created some [automation to run the clickbench tests](https://github.com/pmcgleenon/datafusion-clickbench-runner). If anyone would like to try it out, I would be interested in your feedback!

Originally the ClickBench datafusion tests used only one instance type: `c6a.4xlarge`.

Recently the ClickHouse team added new instance types to the datafusion results: [c6a.xlarge](https://github.com/ClickHouse/ClickBench/commit/8ab83ad9c1745f71edd1105426babe32d41a7be8), [c6a.2xlarge](https://github.com/ClickHouse/ClickBench/commit/c11483588a72b3943971e0550a62e689a138cc23) and [c8g.4xlarge](https://github.com/ClickHouse/ClickBench/commit/b247b2045583558412fff8c67d68fae0765ad71d).

The instance type [`c6a.xlarge`](https://instances.vantage.sh/aws/ec2/c6a.xlarge) has only 8GB RAM (the clickbench dataset is 15GB), and there are a number of issues with it, including:
- datafusion compilation fails on this instance. We can work around this by using `brew install`
- OOM errors happen during test execution, with results reported as null for several tests
- the machine becomes unresponsive, with ssh and shell not working during test execution

I see two options here:
1. report results for the `c6a.xlarge` and use the `brew install` workaround to get around the compilation issues. In this case some of the results will be null. I didn't see a way to specify a particular datafusion version with `brew install`, so it will always pick up the latest version (currently 50.3.0)
2. remove `c6a.xlarge` from the results until datafusion becomes functional on the 8GB RAM machine.
We would have the option to compile datafusion with `target-cpu=native` to squeeze out some more performance.

I think we should avoid reporting results for `c6a.xlarge` until the issues are resolved, particularly since running the tests has such a negative impact on the server.

@alamb @Dandandan and everyone else - interested in your opinion on the way forward here.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
