alamb opened a new pull request #8996:
URL: https://github.com/apache/arrow/pull/8996


   This PR is based on @mqy's great work in: 
https://github.com/apache/arrow/pull/8967. 
   
   (If we want to take this PR, we can either merge it into 
https://github.com/apache/arrow/pull/8967/files# directly, or I can make a new 
independent PR once that one is merged.)
   
   The outcome is that developers can now simply run `cargo test` in a typical 
checkout without having to mess with environment variables. I think this will 
lower the barrier to entry for people to contribute. 
   
   The changes are:
   1. Code from https://github.com/apache/arrow/pull/8967 to encode heuristics 
for where to look for test data
   2. Remove all references to `ARROW_TEST_DATA` and `PARQUET_TEST_DATA` and use 
the test_util methods instead
   3. Update the comments / error messages in test_util
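
To illustrate the lookup heuristic described in the list above, here is a minimal sketch. It is not the actual `test_util` code: the function names `resolve_from` and `resolve_data_dir` and their signatures are hypothetical, and the behavior is inferred only from the error messages shown in this description (a non-empty environment variable wins; otherwise a pre-defined directory relative to the crate is tried).

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not the real test_util API): pick a data directory
/// from an optional env value, falling back to `base_dir/fallback`.
fn resolve_from(
    env_value: Option<&str>,
    env_name: &str,
    base_dir: &Path,
    fallback: &str,
) -> Result<PathBuf, String> {
    // 1. A non-empty environment value takes priority and must exist.
    if let Some(value) = env_value {
        let trimmed = value.trim();
        if !trimmed.is_empty() {
            let dir = PathBuf::from(trimmed);
            return if dir.is_dir() {
                Ok(dir)
            } else {
                Err(format!(
                    "the data dir `{}` defined by env {} not found",
                    dir.display(),
                    env_name
                ))
            };
        }
    }

    // 2. Otherwise try the pre-defined directory relative to the crate.
    let dir = base_dir.join(fallback);
    if dir.is_dir() {
        Ok(dir)
    } else {
        Err(format!(
            "env `{}` is undefined or has empty value, and the pre-defined \
             data dir `{}` not found\nHINT: try running `git submodule update --init`",
            env_name,
            dir.display()
        ))
    }
}

/// Thin wrapper that reads the process environment (also hypothetical).
fn resolve_data_dir(env_name: &str, base_dir: &Path, fallback: &str) -> Result<PathBuf, String> {
    let value = env::var(env_name).ok();
    resolve_from(value.as_deref(), env_name, base_dir, fallback)
}
```

Under these assumptions, a test in the `arrow` crate could call something like `resolve_data_dir("ARROW_TEST_DATA", Path::new(env!("CARGO_MANIFEST_DIR")), "../../testing/data")` and panic with the returned message on `Err`. Keeping the environment read in a separate wrapper lets the decision logic be tested without mutating process-wide state.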
   
   ## Example Error Handling
   
   Error handling: here is what happens with a fresh checkout, no git 
submodules checked out, and no environment variables set:
   
   ```
   cargo test -p arrow
   ---- ipc::reader::tests::read_decimal_be_file_should_panic stdout ----
   thread 'ipc::reader::tests::read_decimal_be_file_should_panic' panicked at 
'failed to get arrow data dir: env `ARROW_TEST_DATA` is undefined or has empty 
value, and the pre-defined data dir 
`/private/tmp/arrow/rust/arrow/../../testing/data` not found
   HINT: try running `git submodule update --init`', 
arrow/src/util/test_util.rs:81:21
   ```
   
   Here is an example of what happens when `ARROW_TEST_DATA` points 
somewhere nonexistent:
   
   ```
   ARROW_TEST_DATA=blargh cargo test -p arrow
   ...
   --- ipc::reader::tests::read_decimal_be_file_should_panic stdout ----
   thread 'ipc::reader::tests::read_decimal_be_file_should_panic' panicked at 
'failed to get arrow data dir: the data dir `blargh` defined by env 
ARROW_TEST_DATA not found', arrow/src/util/test_util.rs:81:21
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

