alamb commented on a change in pull request #8705:
URL: https://github.com/apache/arrow/pull/8705#discussion_r526441242
##########
File path: rust/benchmarks/README.md
##########
@@ -49,45 +49,16 @@ data. This value can be increased to generate larger data
sets.
The benchmark can then be run (assuming the data created from `dbgen` is in
`/mnt/tpch-dbgen`) with a command such as:
```bash
-cargo run --release --bin tpch -- --iterations 3 --path /mnt/tpch-dbgen --format tbl --query 1 --batch-size 4096
+cargo run --release --bin tpch -- benchmark --iterations 3 --path /mnt/tpch-dbgen --format tbl --query 1 --batch-size 4096
```
-The benchmark program also supports CSV and Parquet input file formats.
-
-This crate does not currently provide a method for converting the generated tbl format to CSV or Parquet so it is necessary to use other tools to perform this conversion.
-
-One option is to use the following Docker image to perform the conversion from `tbl` files to CSV or Parquet.
-
-```bash
-docker run -it ballistacompute/spark-benchmarks:0.4.0-SNAPSHOT
- -h, --help Show help message
-
-Subcommand: convert-tpch
- -i, --input <arg>
- --input-format <arg>
- -o, --output <arg>
- --output-format <arg>
- -p, --partitions <arg>
Review comment:
FWIW the Rust version doesn't seem to have any option to create
partitions, which is fine for a first version. However, it might be
worthwhile to leave these instructions in until we have added the `-p`
option to the Rust tooling.
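For reference, the removed help text suggests an invocation along these lines. This is a sketch, not a tested command: the input/output paths, the volume mount, and the partition count are placeholders, and the exact subcommand syntax should be verified against the image's own `--help` output.

```bash
# Hypothetical invocation assembled from the convert-tpch options shown
# in the removed help text; verify flags against the image's --help
# before relying on this. The bind mount and paths are placeholders.
docker run -it -v /mnt/tpch-dbgen:/data \
  ballistacompute/spark-benchmarks:0.4.0-SNAPSHOT \
  convert-tpch \
  --input /data --input-format tbl \
  --output /data/parquet --output-format parquet \
  --partitions 64
```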
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]