alamb opened a new pull request #8996: URL: https://github.com/apache/arrow/pull/8996
This PR is based on @mqy's great work in https://github.com/apache/arrow/pull/8967. (If we want to take this PR, we can either merge it into https://github.com/apache/arrow/pull/8967/files# directly, or I can make a new independent PR once that one is merged.)

The outcome is that developers can now simply run `cargo test` in a typical checkout without having to mess with environment variables. I think this will lower the barrier to entry for people to contribute.

The changes are:
1. Code from https://github.com/apache/arrow/pull/8967 that encodes heuristics for where to look for test data
2. Remove all references to `ARROW_TEST_DATA` and `PARQUET_TEST_DATA` and use the `test_util` methods instead
3. Update the comments / error messages in `test_util`

## Example Error Handling

Here is what happens with a fresh checkout, with no git submodules checked out and no environment variables set:

```
cargo test -p arrow

---- ipc::reader::tests::read_decimal_be_file_should_panic stdout ----
thread 'ipc::reader::tests::read_decimal_be_file_should_panic' panicked at 'failed to get arrow data dir: env `ARROW_TEST_DATA` is undefined or has empty value, and the pre-defined data dir `/private/tmp/arrow/rust/arrow/../../testing/data` not found
HINT: try running `git submodule update --init`', arrow/src/util/test_util.rs:81:21
```

Here is an example of what happens when `ARROW_TEST_DATA` points somewhere nonexistent:

```
ARROW_TEST_DATA=blargh cargo test -p arrow
...
--- ipc::reader::tests::read_decimal_be_file_should_panic stdout ----
thread 'ipc::reader::tests::read_decimal_be_file_should_panic' panicked at 'failed to get arrow data dir: the data dir `blargh` defined by env ARROW_TEST_DATA not found', arrow/src/util/test_util.rs:81:21
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [email protected]
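
For readers following the PR description, the lookup order implied by the two error messages (a non-empty environment variable first, then a pre-defined directory relative to the crate) could be sketched roughly as below. This is only an illustration under assumptions: the function name `find_data_dir` and its signature are hypothetical and are not taken from `test_util`; only the error wording is borrowed from the output quoted in the PR.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not the actual `test_util` code): resolve a test
/// data directory from an environment variable, falling back to a
/// pre-defined directory such as the `testing/data` submodule path.
fn find_data_dir(env_name: &str, fallback: &Path) -> Result<PathBuf, String> {
    // Step 1: a defined, non-empty environment variable takes priority.
    if let Ok(value) = env::var(env_name) {
        let trimmed = value.trim();
        if !trimmed.is_empty() {
            let dir = PathBuf::from(trimmed);
            // If the variable is set but points nowhere, report that
            // directly instead of silently using the fallback.
            return if dir.is_dir() {
                Ok(dir)
            } else {
                Err(format!(
                    "the data dir `{}` defined by env {} not found",
                    dir.display(),
                    env_name
                ))
            };
        }
    }

    // Step 2: the variable is undefined or empty, so try the fallback.
    if fallback.is_dir() {
        Ok(fallback.to_path_buf())
    } else {
        Err(format!(
            "env `{}` is undefined or has empty value, and the pre-defined \
             data dir `{}` not found\nHINT: try running `git submodule update --init`",
            env_name,
            fallback.display()
        ))
    }
}
```

A test helper built on this sketch might call `find_data_dir("ARROW_TEST_DATA", Path::new("../../testing/data"))` and panic with the returned message on `Err`, which would match the shape of the output shown in the PR description.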
