Guthman opened a new issue, #47173: URL: https://github.com/apache/arrow/issues/47173
### Describe the enhancement requested

Arrow currently supports JSON reading in C++, Python, Java, etc., but it lacks any equivalent JSON writer. While Rust (`arrow_json::writer`) and Go (`arrjson`) implement their own serialization, they do not leverage the shared C++ core.

This results in the following limitations:

- Python users (e.g., BigQuery→Arrow→Postgres JSONB, which is my particular use case) must fall back to slow Python-level loops or to orjson, missing C++-level performance.
- There is no feature parity with Rust and Go, which already provide fast JSON serialization.
- Large-scale pipelines suffer from marshalling overhead and poor scaling.

## Proposal Overview

### 1. **C++ Core: Add JSON Writer API**

- Mirror the existing `arrow::json::TableReader` with a new `arrow::json::TableWriter` or `RecordBatchWriter`.
- Support both output formats:
  - **NDJSON** (newline-delimited)
  - **JSON array**
- Configurable via builder-pattern options:
  - Include or omit nulls
  - Binary type encoding (e.g., Base64)
  - Formatting (pretty, flat)

### 2. **Bindings**

- **PyArrow**: add `pyarrow.json.write_json(table_or_batch, sink=None, ndjson=False, include_nulls=True, binary_encoding="base64")`, wrapping the new C++ API.
- **Arrow Java**: introduce a corresponding `JsonWriter` class to maintain cross-language feature consistency.

### 3. **Functionality & Performance**

- Full support for Arrow types: scalars, nested structs/lists, binary, timestamps, dictionaries, nulls.
- Streaming output row by row, so the full result is never buffered in memory.
- Benchmark target: near-native performance, comparable to Rust's `LineDelimitedWriter`.

*I had this request edited by an LLM, as I'm not very familiar with the Arrow backend architecture. I checked all the claims, but some inaccuracies might have slipped through; if so, sorry.*

### Component(s)

C++

--
This is an automated message from the Apache Git Service.
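To make the options proposed in the issue concrete, here is a minimal pure-Python sketch of the intended semantics, applied to plain dict rows rather than Arrow data. It is not PyArrow API: `write_json_rows` is a hypothetical helper, its parameter names are borrowed from the proposed `write_json` signature, and the handling of nulls inside lists (always kept, so positions do not shift) is an assumption the issue does not specify.

```python
import base64
import io
import json


def write_json_rows(rows, sink, ndjson=False, include_nulls=True,
                    binary_encoding="base64"):
    """Hypothetical helper: write dict rows as NDJSON or as one JSON array."""
    if binary_encoding != "base64":
        raise ValueError("this sketch only implements base64")

    def convert(value):
        # Binary values are not valid JSON, so encode them as Base64 text.
        if isinstance(value, (bytes, bytearray)):
            return base64.b64encode(bytes(value)).decode("ascii")
        # Struct-like values: optionally drop fields whose value is null.
        if isinstance(value, dict):
            return {k: convert(v) for k, v in value.items()
                    if include_nulls or v is not None}
        # List-like values: nulls are kept so element positions do not shift.
        if isinstance(value, (list, tuple)):
            return [convert(v) for v in value]
        return value

    if ndjson:
        # One JSON object per line, written as each row is consumed.
        for row in rows:
            sink.write(json.dumps(convert(row)) + "\n")
    else:
        # A single JSON array, still emitted row by row.
        sink.write("[")
        for i, row in enumerate(rows):
            if i:
                sink.write(",")
            sink.write(json.dumps(convert(row)))
        sink.write("]\n")


rows = [
    {"id": 1, "payload": b"hi", "note": None},
    {"id": 2, "payload": None, "note": "ok"},
]
buf = io.StringIO()
write_json_rows(rows, buf, ndjson=True, include_nulls=False)
print(buf.getvalue(), end="")
```

Both branches write one row at a time to `sink`, which mirrors the streaming goal in the proposal. A C++ implementation would presumably work on columnar batches rather than Python objects, so this only illustrates the output shapes and option behavior.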