gabotechs commented on PR #19771:
URL: https://github.com/apache/datafusion/pull/19771#issuecomment-3742284656

   Ok, I see what happened here:
   
   - @alamb actually added this line `TPCDS_DIR="${DATA_DIR}/tpcds_sf1"` 
already in 
https://github.com/apache/datafusion/pull/19244/files#diff-1769f5787dc11c8b1f1b48288cdf3c89d25a5b5cbc6be4740bfcc70a6313ba99L694-R688,
 and then doing `./benchmarks/bench.sh data tpcds && ./benchmarks/bench.sh run 
tpcds` started working out of the box.
   - @comphead reverted this change in 
https://github.com/apache/datafusion/pull/19552/files#diff-1769f5787dc11c8b1f1b48288cdf3c89d25a5b5cbc6be4740bfcc70a6313ba99R686-R687,
 and `./benchmarks/bench.sh data tpcds && ./benchmarks/bench.sh run tpcds` 
no longer works out of the box: users now have to manually set the 
`DATA_DIR` env variable with 
`export DATA_DIR=../../datafusion-benchmarks/tpcds/data/sf1/`. Andrew's 
benchmarking bot is affected as well, see 
[here](https://github.com/apache/datafusion/pull/19761#issuecomment-3739887255).
   
   IMO it would be nicer if 
`./benchmarks/bench.sh data tpcds && ./benchmarks/bench.sh run tpcds` worked 
out of the box, the same way it does for the TPC-H benchmark, without 
requiring users to set the `DATA_DIR` env variable.
   
   In fact, I'd bet the intention behind this code here 
https://github.com/apache/datafusion/blob/main/benchmarks/bench.sh#L644-L646 is 
that it works that way, as it's explicitly extracting the contents to 
`"${DATA_DIR}/tpcds_sf1"`:
   ```
           echo "Extracting TPC-DS parquet data to ${TPCDS_DIR}..."
           unzip -o -j -d "${TPCDS_DIR}" "${DATA_DIR}/datafusion-benchmarks.zip" datafusion-benchmarks-main/tpcds/data/sf1/*
           echo "TPC-DS data extracted."
   ```
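
To make the idea concrete, here is a minimal sketch of how the run step could default to the directory the extraction step writes into. It is not the actual `bench.sh` code: the helper name `resolve_tpcds_dir` and the `./data` fallback for `DATA_DIR` are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the real bench.sh implementation.
# resolve_tpcds_dir is an illustrative helper name.

# Print the directory the TPC-DS run step should read from.
# An explicitly exported TPCDS_DIR wins; otherwise fall back to the
# location the `data tpcds` step extracts into: "${DATA_DIR}/tpcds_sf1".
resolve_tpcds_dir() {
    local data_dir="$1"
    echo "${TPCDS_DIR:-${data_dir}/tpcds_sf1}"
}

# Assumed default for illustration; the real script may compute DATA_DIR differently.
DATA_DIR="${DATA_DIR:-./data}"

echo "TPC-DS data directory: $(resolve_tpcds_dir "${DATA_DIR}")"
```

With a default like this, `data tpcds` and `run tpcds` would agree on the same path when nothing is exported, while an explicit `TPCDS_DIR` would still let users point at a separate checkout of the benchmark data.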
   
   



