alamb commented on code in PR #18985:
URL: https://github.com/apache/datafusion/pull/18985#discussion_r2585928759


##########
benchmarks/bench.sh:
##########
@@ -597,6 +607,14 @@ data_tpch() {
     fi
 }
 
+# Points to TPCDS data generation instructions
+data_tpcds() {
+    echo ""
+    echo "For TPC-DS data generation, please clone the datafusion-benchmarks repository:"

Review Comment:
   I suggest showing this message only when the directory is not present.
   
   When I tried it out, I was confused:
   ```shell
   andrewlamb@Andrews-MacBook-Pro-3:~/Software/datafusion$ ./benchmarks/bench.sh data tpcds
   ***************************
   DataFusion Benchmark Runner and Data Generator
   COMMAND: data
   BENCHMARK: tpcds
   DATA_DIR: /Users/andrewlamb/Software/datafusion/benchmarks/data
   CARGO_COMMAND: cargo run --release
   PREFER_HASH_JOIN: true
   ***************************
   
   For TPC-DS data generation, please clone the datafusion-benchmarks repository:
     git clone https://github.com/apache/datafusion-benchmarks
   ```
   
   So I did what the script told me:
   ```shell
   andrewlamb@Andrews-MacBook-Pro-3:~/Software/datafusion$ git clone https://github.com/apache/datafusion-benchmarks
   Cloning into 'datafusion-benchmarks'...
   remote: Enumerating objects: 283, done.
   remote: Counting objects: 100% (35/35), done.
   remote: Compressing objects: 100% (26/26), done.
   remote: Total 283 (delta 18), reused 9 (delta 9), pack-reused 248 (from 3)
   Receiving objects: 100% (283/283), 268.89 MiB | 40.49 MiB/s, done.
   Resolving deltas: 100% (64/64), done.
   
   ```
   
   Then I ran the data command again and was told to clone the benchmarking repository again 🤔
   
   ```shell
   andrewlamb@Andrews-MacBook-Pro-3:~/Software/datafusion$ ./benchmarks/bench.sh data tpcds
   ***************************
   DataFusion Benchmark Runner and Data Generator
   COMMAND: data
   BENCHMARK: tpcds
   DATA_DIR: /Users/andrewlamb/Software/datafusion/benchmarks/data
   CARGO_COMMAND: cargo run --release
   PREFER_HASH_JOIN: true
   ***************************
   
   For TPC-DS data generation, please clone the datafusion-benchmarks repository:
     git clone https://github.com/apache/datafusion-benchmarks
   
   andrewlamb@Andrews-MacBook-Pro-3:~/Software/datafusion$
   ```
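   
   A rough sketch of the suggestion, not the PR's actual code: the function could check for the clone before printing the instructions. `TPCDS_REPO_DIR`, its default path, and the "Found" message are hypothetical and assumed only for illustration; the script may expect the repository somewhere else.
   
   ```shell
   # Sketch only: print the clone instructions only when the repository is missing.
   # TPCDS_REPO_DIR is a hypothetical variable, not something bench.sh defines today.
   data_tpcds() {
       # Assumed default location: a clone in the current working directory.
       local repo_dir="${TPCDS_REPO_DIR:-./datafusion-benchmarks}"

       if [ -d "${repo_dir}" ]; then
           # The clone is already present, so skip the instructions.
           echo "Found datafusion-benchmarks at ${repo_dir}"
           return 0
       fi

       echo ""
       echo "For TPC-DS data generation, please clone the datafusion-benchmarks repository:"
       echo "  git clone https://github.com/apache/datafusion-benchmarks"
   }
   ```
   
   With a check like this, the second run in the transcript above would report the existing clone instead of repeating the instructions.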



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

