alamb commented on code in PR #14225:
URL: https://github.com/apache/datafusion/pull/14225#discussion_r1924394309
##########
benchmarks/bench.sh:
##########
@@ -536,23 +536,52 @@ data_imdb() {
done
if [ "$convert_needed" = true ]; then
- if [ ! -f "${imdb_dir}/imdb.tgz" ]; then
- echo "Downloading IMDB dataset..."
+ # Expected size of the dataset
Review Comment:
I tried running this locally on my Mac and `numfmt` does not seem to be
installed. Is there any way to make this work without having to install a new
program?
```shell
andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ rm benchmarks/data/imdb/*.parquet
andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ ./benchmarks/bench.sh data imdb
***************************
DataFusion Benchmark Runner and Data Generator
COMMAND: data
BENCHMARK: imdb
DATA_DIR: /Users/andrewlamb/Software/datafusion/benchmarks/data
CARGO_COMMAND: cargo run --release
PREFER_HASH_JOIN: true
***************************
Looking for imdb.tgz... found
Checking size... ./benchmarks/bench.sh: line 551: numfmt: command not found
OK ()
```
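One possible fallback, as a rough sketch only: it assumes `numfmt` is used here just to print a byte count in human-readable form, which the output above suggests but does not confirm. The helper names `human_size` and `human_size_awk` are hypothetical and not part of `bench.sh`. The idea is to use `numfmt` when it exists and otherwise fall back to `awk`, which ships with macOS.

```shell
# Hypothetical helpers (not from bench.sh). Assumes the script runs under bash.

# Format a byte count with awk only, e.g. 1048576 -> 1.0M.
human_size_awk() {
    awk -v b="$1" 'BEGIN {
        split("B K M G T", unit, " ")
        i = 1
        # Divide by 1024 until the value fits the current unit
        while (b >= 1024 && i < 5) { b /= 1024; i++ }
        printf "%.1f%s\n", b, unit[i]
    }'
}

# Prefer numfmt when it is installed; otherwise use the awk fallback.
human_size() {
    if command -v numfmt >/dev/null 2>&1; then
        numfmt --to=iec "$1"
    else
        human_size_awk "$1"
    fi
}
```

With something like this, the size check could call `human_size "$size_in_bytes"` instead of `numfmt` directly. The two branches may format some values slightly differently (for example in rounding or in how plain byte counts are shown), so this is only suitable for display, not for comparisons.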
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]