alamb opened a new issue, #18473: URL: https://github.com/apache/datafusion/issues/18473
### Describe the bug

@pmcgleenon reports on https://github.com/apache/datafusion/issues/17721#issuecomment-3476400608:

> The instance type [c6a.xlarge](https://instances.vantage.sh/aws/ec2/c6a.xlarge) only has 8GB RAM (the clickbench dataset is 15GB) and there are a number of issues with it, including
>
> ...
> OOM errors happens during test execution, with results reported as null for several tests

### To Reproduce

1. Use docker to build datafusion:

```shell
cd .devcontainer
docker build -t datafusion-build .
cd ..
docker run -m 4G -v `pwd`:/datafusion -it datafusion-build /bin/bash
```

Now, in the docker container:

```shell
# build datafusion-cli
cd /datafusion
cargo install --profile=release-nonlto --path datafusion-cli

# Get benchmark data
cd /datafusion/benchmarks
./bench.sh data clickbench_partitioned

cd /datafusion/benchmarks/data
# make symlink to hits so queries can run without modification
ln -s hits_partitioned hits

# run the queries
for q in `ls ../queries/clickbench/queries/*.sql` ; do datafusion-cli -f $q ; done
```

This is the loop that runs the queries, echoing each file name before it runs:

```shell
for q in `ls ../queries/clickbench/queries/*.sql` ; do echo "Running $q..." ; datafusion-cli -f $q ; done
```

You'll see queries get killed due to OOM like this:

```
Running ../queries/clickbench/queries/q18.sql...
DataFusion CLI v50.3.0
bash: line 1: 57537 Killed                  datafusion-cli -f $q
```

The queries that are killed are:

> Running ../queries/clickbench/queries/q18.sql...
> Running ../queries/clickbench/queries/q20.sql...
> Running ../queries/clickbench/queries/q22.sql...
> Running ../queries/clickbench/queries/q23.sql...
> Running ../queries/clickbench/queries/q32.sql...
> Running ../queries/clickbench/queries/q33.sql...
> Running ../queries/clickbench/queries/q34.sql...
> Running ../queries/clickbench/queries/q35.sql...
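To gather the list of killed queries without scanning the terminal output, the loop can record each exit status. The sketch below is only an illustration and not part of the original report: `run_queries` is a hypothetical helper name, and it relies on the shell convention that a process terminated by SIGKILL (signal 9) reports exit status 137 (128 + 9). A status of 137 indicates the process was killed, which in a memory-limited container is usually, though not necessarily, the OOM killer.

```shell
# Hypothetical helper (not from the original report): run every query file
# with the given CLI command and list the ones that ended with status 137,
# i.e. were terminated by SIGKILL.
run_queries() {
  cmd="$1"    # command to run, e.g. datafusion-cli
  shift       # the remaining arguments are the query files
  killed=""
  for q in "$@"; do
    echo "Running $q..."
    $cmd -f "$q"
    status=$?
    # 137 = 128 + 9 (SIGKILL); assumed here to mean the OOM killer fired
    if [ "$status" -eq 137 ]; then
      killed="$killed $q"
    fi
  done
  echo "Killed:$killed"
}
```

Inside the container this could be invoked as `run_queries datafusion-cli ../queries/clickbench/queries/*.sql`; the last line of output then names the files whose runs were killed.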
You can find these queries here: https://github.com/apache/datafusion/tree/main/benchmarks/queries/clickbench/queries

### Expected behavior

The queries should not be killed

### Additional context

This ticket tracks build OOM'ing:
- https://github.com/apache/datafusion/issues/18471
