alamb commented on a change in pull request #8996:
URL: https://github.com/apache/arrow/pull/8996#discussion_r548523886



##########
File path: rust/parquet/src/util/test_common/file_util.rs
##########
@@ -19,17 +19,8 @@ use std::{env, fs, io::Write, path::PathBuf, str::FromStr};
 
 /// Returns path to the test parquet file in 'data' directory
 pub fn get_test_path(file_name: &str) -> PathBuf {
-    let mut pathbuf = match env::var("PARQUET_TEST_DATA") {
-        Ok(path) => PathBuf::from_str(path.as_str()).unwrap(),
-        Err(_) => {
-            let mut pathbuf = env::current_dir().unwrap();
-            pathbuf.pop();
-            pathbuf.pop();
-            pathbuf
-                .push(PathBuf::from_str("cpp/submodules/parquet-testing/data").unwrap());
-            pathbuf
-        }
-    };
+    let mut pathbuf =
+        PathBuf::from_str(&arrow::util::test_util::parquet_test_data()).unwrap();

Review comment:
   @nevi-me and @mqy -- I tried to move `parquet_test_data` into
https://github.com/apache/arrow/blob/master/rust/parquet/src/util/test_common/file_util.rs
 -- however, the code quickly got messy because `parquet::util` is not
publicly exported, so I can't use functions defined there from outside the
parquet crate (in datafusion, for example). Furthermore, the `test_utils` are
only compiled in `test` config, while several datafusion examples use the
parquet test data and are not compiled in `test` config.
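
   The two obstacles described above can be sketched roughly as follows. This
is a simplified, hypothetical layout, not the actual parquet crate source: the
module and function names (`util`, `test_common`, `shared`, `test_data_dir`)
are assumptions chosen only for illustration.

```rust
// Hypothetical layout for illustration; not the real parquet crate.

// Declared without `pub`, so other crates cannot name anything under `util`.
#[allow(dead_code)]
mod util {
    // Gated on `cfg(test)`: this module exists only when the crate is built
    // for its own tests, so examples and downstream crates never see it.
    #[cfg(test)]
    pub mod test_common {
        pub fn get_test_path(file_name: &str) -> String {
            format!("data/{}", file_name)
        }
    }
}

// Public and compiled in every configuration, so a helper placed here is
// reachable from tests, examples, and other crates alike.
pub mod shared {
    pub fn test_data_dir() -> &'static str {
        "data"
    }
}
```

   A helper that has to serve both tests and examples needs to sit in something
like `shared` here: publicly reachable and not behind `cfg(test)`.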
   
   I can think of several possibilities:
   1. Leave the `parquet_test_data` function in the arrow crate, as it is in this PR
   2. Make a copy of `parquet_test_data` in the parquet crate
   3. Make the parquet `util` module public and export `test_util` in all configurations
   
   Given that this function is used in tests and the other options seem messy 
to me, I suggest number 1 (though perhaps I am being lazy)
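
   For reference, a minimal sketch of what a shared helper along the lines of
option 1 might look like. It is modelled on the logic removed in the diff above
(check `PARQUET_TEST_DATA`, otherwise go two directories up and into the
`parquet-testing` submodule). The actual `arrow::util::test_util::parquet_test_data`
in this PR may differ; the name `parquet_test_data_from` and the choice to
return a `PathBuf` are assumptions made for this example.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical, testable core: choose the data directory from an optional
/// override value and a base directory.
pub fn parquet_test_data_from(env_value: Option<String>, base_dir: PathBuf) -> PathBuf {
    match env_value {
        // An explicit, non-empty override wins.
        Some(path) if !path.is_empty() => PathBuf::from(path),
        // Otherwise go two levels up from the base directory and descend
        // into the parquet-testing submodule, as the removed code did.
        _ => {
            let mut dir = base_dir;
            dir.pop();
            dir.pop();
            dir.push("cpp/submodules/parquet-testing/data");
            dir
        }
    }
}

/// Hypothetical public entry point: reads `PARQUET_TEST_DATA` and the
/// current working directory, then defers to the function above.
pub fn parquet_test_data() -> PathBuf {
    let base = env::current_dir().expect("current directory should be readable");
    parquet_test_data_from(env::var("PARQUET_TEST_DATA").ok(), base)
}
```

   Keeping the environment lookup separate from the path logic lets the
fallback be checked without touching process-wide state.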




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

