manojkarthick opened a new pull request #9306: URL: https://github.com/apache/arrow/pull/9306
Add an option to print output in JSON format in the `parquet-read` binary. Having JSON output allows for easy analysis using tools like [jq](https://stedolan.github.io/jq/). This PR builds on the changes implemented in https://github.com/apache/arrow/pull/8686 and incorporates the suggestions in that PR.

**Changelog**

* Update all three binaries `parquet-schema`, `parquet-rowcount` and `parquet-read` to use [clap](https://github.com/clap-rs/clap) for argument parsing
* Add `to_json_value()` method to get `serde_json::Value` from `Row` and `Field` structs (Thanks to @jhorstmann for these changes!)
* parquet-schema:
  * Convert verbose argument into `-v/--verbose` flag
* parquet-read:
  * Add a new flag `-j/--json` that prints the file contents in JSON lines format
  * The feature is gated under the `json_output` cargo feature
* Update documentation and README with instructions for running
* The binaries now use version and author information as defined in Cargo.toml

Example output:

```
❯ parquet-read cities.parquet 3 --json
{"continent":"Europe","country":{"name":"France","city":["Paris","Nice","Marseilles","Cannes"]}}
{"continent":"Europe","country":{"name":"Greece","city":["Athens","Piraeus","Hania","Heraklion","Rethymnon","Fira"]}}
{"continent":"North America","country":{"name":"Canada","city":["Toronto","Vancouver","St. John's","Saint John","Montreal","Halifax","Winnipeg","Calgary","Saskatoon","Ottawa","Yellowknife"]}}
```
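For readers unfamiliar with the JSON lines format, the idea of turning one nested record into one line of JSON can be sketched without any external crates. This is only an illustration: the `Value` enum and the `escape` and `to_json` helpers below are hypothetical stand-ins, not the `Row`/`Field` types or the `to_json_value()` method from this PR, which returns a `serde_json::Value` instead of building strings by hand.

```rust
// Hypothetical stand-in for a nested record value.
// This is NOT the parquet crate's `Field` type; it only models
// strings, lists and groups, which is enough for the example output.
#[derive(Debug, Clone)]
enum Value {
    Str(String),
    List(Vec<Value>),
    Group(Vec<(String, Value)>),
}

// Escape the characters that JSON requires to be escaped inside a string.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

// Recursively serialize a value as compact JSON with no newlines,
// so that each record fits on exactly one output line.
fn to_json(v: &Value) -> String {
    match v {
        Value::Str(s) => format!("\"{}\"", escape(s)),
        Value::List(items) => {
            let parts: Vec<String> = items.iter().map(to_json).collect();
            format!("[{}]", parts.join(","))
        }
        Value::Group(fields) => {
            let parts: Vec<String> = fields
                .iter()
                .map(|(k, v)| format!("\"{}\":{}", escape(k), to_json(v)))
                .collect();
            format!("{{{}}}", parts.join(","))
        }
    }
}

fn main() {
    // One record shaped like the rows in the example output.
    let row = Value::Group(vec![
        ("continent".to_string(), Value::Str("Europe".to_string())),
        (
            "country".to_string(),
            Value::Group(vec![
                ("name".to_string(), Value::Str("France".to_string())),
                (
                    "city".to_string(),
                    Value::List(vec![
                        Value::Str("Paris".to_string()),
                        Value::Str("Nice".to_string()),
                    ]),
                ),
            ]),
        ),
    ]);
    // JSON lines: one record per line.
    println!("{}", to_json(&row));
}
```

Because each record occupies a single line, the output can be piped to line-oriented tools such as jq one record at a time.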