wjones127 commented on code in PR #14355:
URL: https://github.com/apache/arrow/pull/14355#discussion_r1039822143
##########
docs/source/cpp/json.rst:
##########
@@ -66,6 +73,49 @@ A JSON file is read from a :class:`~arrow::io::InputStream`.
}
}
+StreamingReader
+===============
+
+Reads a file incrementally in fixed-size blocks, each yielding a
+:class:`~arrow::RecordBatch`. Each independent JSON object in a block
+is converted to a row in the output batch.
+
+All batches adhere to a consistent :class:`~arrow::Schema`, which is
+derived from the first loaded batch.
Review Comment:
```suggestion
derived from the first loaded batch. Alternatively, an explicit schema
may be passed via :class:`~arrow::json::ParseOptions`.
```
##########
docs/source/cpp/json.rst:
##########
@@ -66,6 +73,49 @@ A JSON file is read from a :class:`~arrow::io::InputStream`.
}
}
+StreamingReader
+===============
+
+Reads a file incrementally in fixed-size blocks, each yielding a
+:class:`~arrow::RecordBatch`. Each independent JSON object in a block
+is converted to a row in the output batch.
+
+All batches adhere to a consistent :class:`~arrow::Schema`, which is
+derived from the first loaded batch.
+
+.. code-block:: cpp
+
+ #include "arrow/json/api.h"
+
+ {
+ // ...
+ auto read_options = arrow::json::ReadOptions::Defaults();
+ auto parse_options = arrow::json::ParseOptions::Defaults();
+
+ std::shared_ptr<arrow::io::InputStream> stream;
+ auto result = arrow::json::StreamingReader::Make(stream,
+ read_options,
+ parse_options);
+ if (!result.ok()) {
+ // Handle instantiation error
+ }
+ std::shared_ptr<arrow::json::StreamingReader> reader = *result;
+
+ std::shared_ptr<arrow::RecordBatch> batch;
+ while (true) {
+ arrow::Status status = reader->ReadNext(&batch);
+
+ if (!status.ok()) {
+ // Handle read/parse error
+ }
+
+ if (batch == nullptr) {
+ // Handle end of file
+ break;
+ }
+ }
Review Comment:
Note: you can also use a range-based for loop to iterate over a `RecordBatchReader`:
```suggestion
for (arrow::Result<std::shared_ptr<arrow::RecordBatch>> maybe_batch :
*reader) {
if (!maybe_batch.ok()) {
// Handle read/parse error
}
batch = *maybe_batch;
// Operate on each batch...
}
```
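For illustration, here is a minimal, self-contained sketch of the per-element check that a loop of this shape relies on: each iteration yields its own result, so the status has to be read from the loop variable. `MiniResult`, `Batch`, `BatchStream`, and `CountRows` are hypothetical stand-ins invented so that the snippet compiles without Arrow; they are not Arrow's API.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for arrow::Result<T>: holds a value or an error message.
template <typename T>
class MiniResult {
 public:
  static MiniResult Ok(T value) {
    MiniResult r;
    r.value_ = std::move(value);
    return r;
  }
  static MiniResult Error(std::string message) {
    MiniResult r;
    r.error_ = std::move(message);
    return r;
  }
  bool ok() const { return value_.has_value(); }
  const T& operator*() const { return *value_; }
  const std::string& error() const { return error_; }

 private:
  MiniResult() = default;
  std::optional<T> value_;
  std::string error_;
};

// Hypothetical stand-ins: a "batch" is a vector of rows, and a "stream" is
// anything iterable that yields one MiniResult per batch.
using Batch = std::vector<int>;
using BatchStream = std::vector<MiniResult<Batch>>;

// Counts rows across all batches, stopping at the first failed element.
// Returns false and fills *error if any yielded element is not ok.
bool CountRows(const BatchStream& stream, std::size_t* rows, std::string* error) {
  *rows = 0;
  for (const MiniResult<Batch>& maybe_batch : stream) {
    if (!maybe_batch.ok()) {  // check the loop variable itself
      *error = maybe_batch.error();
      return false;
    }
    *rows += (*maybe_batch).size();
  }
  return true;
}
```

With a real reader the same shape should apply, with `maybe_batch` being an `arrow::Result<std::shared_ptr<arrow::RecordBatch>>` and the error available through its status.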