rok commented on code in PR #1:
URL: https://github.com/apache/parquet-benchmark/pull/1#discussion_r1722965463


##########
README.md:
##########
@@ -1 +1,28 @@
-# Apache Parquet Benchmarking
+# Parquet benchmark data
+
+This repository contains Parquet benchmark data. Such data is useful to help
+optimize Parquet implementations but also advance the Parquet format itself.
+
+At this point the community requests donation of Parquet footers and especially
+footers that are large and slow to parse/process. Typically these are footers of
+wide schemata: either coming from lots of individual columns and/or deeply nested
+structs.
+
+To donate Parquet footers we have built a binary `parquet-dump-footer` as part
+of parquet tools. This utility extracts footers from parquet, scrubs binary data
+for privacy reasons and allows to pretty print (`--debug`) the result for
+inspection before submission.
+
+When you are ready to donate a footer please open a PR against this repository
+and add your footer under `footer/<name>.footer`.
+
+Use `parquet-dump-footer --help` for explantion of all the options.
+
+## alternate parquet-dump-footer binary
+
+You can binaries in this repo for different architectures. The binaries are built

Review Comment:
   ```suggestion
   You can use binaries in this repo for different architectures. The binaries are built
   ```



##########
README.md:
##########
@@ -1 +1,28 @@
-# Apache Parquet Benchmarking
+# Parquet benchmark data
+
+This repository contains Parquet benchmark data. Such data is useful to help
+optimize Parquet implementations but also advance the Parquet format itself.
+
+At this point the community requests donation of Parquet footers and especially
+footers that are large and slow to parse/process. Typically these are footers of
+wide schemata: either coming from lots of individual columns and/or deeply nested
+structs.
+
+To donate Parquet footers we have built a binary `parquet-dump-footer` as part
+of parquet tools. This utility extracts footers from parquet, scrubs binary data
+for privacy reasons and allows to pretty print (`--debug`) the result for
+inspection before submission.
+
+When you are ready to donate a footer please open a PR against this repository
+and add your footer under `footer/<name>.footer`.
+
+Use `parquet-dump-footer --help` for explantion of all the options.
+
+## alternate parquet-dump-footer binary
+
+You can binaries in this repo for different architectures. The binaries are built
+using the following cmake configuration.

Review Comment:
   ```suggestion
   from parquet-cpp using the following cmake configuration.
   ```



##########
README.md:
##########
@@ -1 +1,28 @@
-# Apache Parquet Benchmarking
+# Parquet benchmark data
+
+This repository contains Parquet benchmark data. Such data is useful to help
+optimize Parquet implementations but also advance the Parquet format itself.
+
+At this point the community requests donation of Parquet footers and especially
+footers that are large and slow to parse/process. Typically these are footers of
+wide schemata: either coming from lots of individual columns and/or deeply nested
+structs.
+
+To donate Parquet footers we have built a binary `parquet-dump-footer` as part
+of parquet tools. This utility extracts footers from parquet, scrubs binary data
+for privacy reasons and allows to pretty print (`--debug`) the result for
+inspection before submission.
+
+When you are ready to donate a footer please open a PR against this repository
+and add your footer under `footer/<name>.footer`.
+
+Use `parquet-dump-footer --help` for explantion of all the options.
+
+## alternate parquet-dump-footer binary
+
+You can binaries in this repo for different architectures. The binaries are built
+using the following cmake configuration.
+
+```sh

Review Comment:
   ```suggestion
   ```sh
   cd arrow/cpp/build
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
