devinjdangelo commented on issue #5347: URL: https://github.com/apache/arrow-datafusion/issues/5347#issuecomment-1970986598
These tests lock up because running them in parallel, which is Cargo's default behavior, uses a very large amount of memory. My system peaked at over 100GB of memory utilization :exploding_head:! I took a look through the dataframe doc tests, and I don't see any inherent reason for such extreme memory usage. I believe @Jefffrey is correct that the cause is Rust loading many copies of a large debug binary into memory. To improve the developer experience, I think a reasonable workaround would be to find a way to make Cargo default to running these specific tests with a maximum parallelism somewhere in the 1-4 range, which should work on most systems. You can do this manually by running `cargo test --doc dataframe -- --test-threads 1`.
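As a rough illustration of the manual workaround, here is a minimal shell sketch that wraps the command above and keeps the thread count inside the 1-4 range. The helper names `clamp_threads` and `run_dataframe_doctests` are hypothetical and not part of the repository; only the `cargo test --doc dataframe -- --test-threads N` invocation comes from the comment itself.

```shell
#!/bin/sh
# Hypothetical helper: clamp a requested thread count into the 1-4 range.
# Defaults to 1 when no argument is given. Expects an integer argument.
clamp_threads() {
    n="${1:-1}"
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    if [ "$n" -gt 4 ]; then
        n=4
    fi
    echo "$n"
}

# Hypothetical wrapper: run the dataframe doc tests with limited parallelism.
# The cargo invocation is the one quoted in the comment, with the thread
# count made configurable.
run_dataframe_doctests() {
    threads="$(clamp_threads "${1:-1}")"
    cargo test --doc dataframe -- --test-threads "$threads"
}
```

After sourcing the file, `run_dataframe_doctests 2` would run the doc tests with two threads, and a larger request such as `run_dataframe_doctests 16` would be capped at four. The cap of four is only the upper end of the range suggested above, not a measured limit.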
