[GitHub] [arrow] alamb commented on a change in pull request #7946: ARROW-9711: [Rust] Add new benchmark derived from TPC-H

GitBox Thu, 13 Aug 2020 08:09:34 -0700


alamb commented on a change in pull request #7946:
URL: https://github.com/apache/arrow/pull/7946#discussion_r470023745




##########
File path: rust/benchmarks/README.md
##########
@@ -19,12 +19,27 @@
 
 # Apache Arrow Rust Benchmarks
 
-This crate contains benchmarks based on the [New York Taxi and Limousine Commission](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page) data set.
+This crate contains benchmarks based on popular public data sets and open source benchmark suites, making it easy to
+run real-world benchmarks to help with performance and scalability testing and for comparing performance with other Arrow
+implementations as well as other query engines.
 
-Currently, only DataFusion benchmarks exist, but the plan is to add benchmarks for the arrow, flight, and parquet crates as well.
+Currently, only DataFusion benchmarks exist, but the plan is to add benchmarks for the arrow, flight, and parquet
+crates as well.
+
+## Benchmark derived from TPC-H
+
+These benchmarks are derived from the [TPC-H](http://www.tpc.org/tpch/) benchmark.
+
+```bash
+cargo run --release --bin tpch -- --iterations 3 --path /mnt/tpch/csv --format csv --query 1 --batch-size 4096

Review comment:
       I may have missed it, but it would be super helpful if you could include instructions (or links to instructions) on how to actually create the TPC-H data.



