gabotechs commented on PR #19771:
URL: https://github.com/apache/datafusion/pull/19771#issuecomment-3742284656
Ok, I see what happened here:
- @alamb had already added the line `TPCDS_DIR="${DATA_DIR}/tpcds_sf1"` in
https://github.com/apache/datafusion/pull/19244/files#diff-1769f5787dc11c8b1f1b48288cdf3c89d25a5b5cbc6be4740bfcc70a6313ba99L694-R688,
after which `./benchmarks/bench.sh data tpcds && ./benchmarks/bench.sh run tpcds`
started working out of the box.
- @comphead reverted this change in
https://github.com/apache/datafusion/pull/19552/files#diff-1769f5787dc11c8b1f1b48288cdf3c89d25a5b5cbc6be4740bfcc70a6313ba99R686-R687,
so `./benchmarks/bench.sh data tpcds && ./benchmarks/bench.sh run tpcds` no
longer works out of the box. Users now have to set the `DATA_DIR` env variable
manually with `export DATA_DIR=../../datafusion-benchmarks/tpcds/data/sf1/`,
and Andrew's benchmarking bot fails
[here](https://github.com/apache/datafusion/pull/19761#issuecomment-3739887255).
IMO it would be nicer if `./benchmarks/bench.sh data tpcds &&
./benchmarks/bench.sh run tpcds` worked out of the box, without requiring users
to set the `DATA_DIR` env variable, the same way it works for the TPC-H benchmark.
In fact, I'd bet the intention behind this code here
https://github.com/apache/datafusion/blob/main/benchmarks/bench.sh#L644-L646 is
that it works that way, as it's explicitly extracting the contents to
`"${DATA_DIR}/tpcds_sf1"`:
```
echo "Extracting TPC-DS parquet data to ${TPCDS_DIR}..."
unzip -o -j -d "${TPCDS_DIR}" \
  "${DATA_DIR}/datafusion-benchmarks.zip" \
  datafusion-benchmarks-main/tpcds/data/sf1/*
echo "TPC-DS data extracted."
```
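To illustrate the point, here is a minimal sketch of why deriving `TPCDS_DIR` from `DATA_DIR` lets the `data` and `run` steps agree on a location without any manual `export`. This is not the actual `bench.sh` code: the function names `tpcds_dir`, `data_tpcds` and `run_tpcds` are hypothetical, and the `./data` fallback is an assumption made only for the example.

```shell
# Hypothetical helper: derive the TPC-DS directory from DATA_DIR.
# The "./data" fallback is an assumption for this sketch only.
tpcds_dir() {
    local data_dir="${DATA_DIR:-./data}"
    echo "${data_dir}/tpcds_sf1"
}

# Hypothetical stand-in for the "data" step: writes into the derived directory.
data_tpcds() {
    mkdir -p "$(tpcds_dir)"
    echo "TPC-DS data extracted to $(tpcds_dir)"
}

# Hypothetical stand-in for the "run" step: reads from the same derived directory.
run_tpcds() {
    if [ ! -d "$(tpcds_dir)" ]; then
        echo "TPC-DS data not found in $(tpcds_dir)" >&2
        return 1
    fi
    echo "Running TPC-DS against $(tpcds_dir)"
}
```

Because both steps resolve the path through the same helper, calling `data_tpcds && run_tpcds` works whether or not `DATA_DIR` is set by the user.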