alamb opened a new issue #467: URL: https://github.com/apache/arrow-datafusion/issues/467
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

As someone new to datafusion, it may not be clear that to run the tests successfully you need to set the `PARQUET_TEST_DATA` and `ARROW_TEST_DATA` environment variables.

So today, here is what happens:

```
git clone https://github.com/apache/arrow-datafusion
cd arrow-datafusion
cargo test -p datafusion
```

Which results in many errors like:

```
---- physical_plan::windows::tests::window_function_input_partition stdout ----
thread 'physical_plan::windows::tests::window_function_input_partition' panicked at 'failed to get arrow data dir: env `ARROW_TEST_DATA` is undefined or has empty value, and the pre-defined data dir `/Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-4.2.0/../testing/data` not found
HINT: try running `git submodule update --init`', /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-4.2.0/src/util/test_util.rs:81:21
```

And even when you run `git submodule update --init` as suggested, it does not work. Instead, you need to set:

```
export ARROW_TEST_DATA=testing/data
export PARQUET_TEST_DATA=parquet-testing/data
cargo test -p datafusion
```

**Describe the solution you'd like**

I would like the tests to automatically try the default locations, as above, if `ARROW_TEST_DATA` and `PARQUET_TEST_DATA` are not set.
The tests should pass successfully with only these commands:

```
git clone https://github.com/apache/arrow-datafusion
cd arrow-datafusion
git submodule update --init
cargo test -p datafusion
```

The arrow-rs crate already does this ([here](https://github.com/apache/arrow-rs/blob/master/arrow/src/util/test_util.rs#L100) and [here](https://github.com/apache/arrow-rs/blob/master/arrow/src/util/test_util.rs#L78)), but now that we no longer have arrow-rs and datafusion in the same workspace, it stopped working.

Perhaps we can simply take the code from arrow-rs and port it to run in datafusion rather than calling `arrow::util::test_util`.

**Describe alternatives you've considered**

None
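
To make the proposal concrete, here is a rough sketch of what a ported lookup might look like. It is not the arrow-rs code: `resolve_data_dir` and `find_data_dir` are hypothetical names, and passing the base directory and the submodule-relative path as parameters is an assumption made so the sketch stays self-contained.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Hypothetical helper: pick a test data directory.
/// A non-empty environment value wins; otherwise fall back to
/// `base/submodule_rel` (the git submodule checkout).
fn resolve_data_dir(
    env_name: &str,
    env_value: Option<&str>,
    base: &Path,
    submodule_rel: &str,
) -> Result<PathBuf, String> {
    // 1. An explicitly set, non-empty variable always takes precedence.
    if let Some(value) = env_value.map(str::trim).filter(|v| !v.is_empty()) {
        let dir = PathBuf::from(value);
        return if dir.is_dir() {
            Ok(dir)
        } else {
            Err(format!(
                "the data dir `{}` defined by env `{}` was not found",
                dir.display(),
                env_name
            ))
        };
    }

    // 2. Otherwise try the default location inside the repository.
    let dir = base.join(submodule_rel);
    if dir.is_dir() {
        Ok(dir)
    } else {
        Err(format!(
            "env `{}` is undefined or empty, and the default data dir `{}` was not found\n\
             HINT: try running `git submodule update --init`",
            env_name,
            dir.display()
        ))
    }
}

/// Hypothetical wrapper that reads the variable from the process environment.
fn find_data_dir(env_name: &str, base: &Path, submodule_rel: &str) -> Result<PathBuf, String> {
    let value = env::var(env_name).ok();
    resolve_data_dir(env_name, value.as_deref(), base, submodule_rel)
}
```

A test could then call something like `find_data_dir("ARROW_TEST_DATA", Path::new(env!("CARGO_MANIFEST_DIR")), "../testing/data")`, assuming the submodule sits one level above the crate directory; the exact relative path would need to be checked against the repository layout.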
