jorisvandenbossche commented on a change in pull request #49:
URL: https://github.com/apache/arrow-cookbook/pull/49#discussion_r698304373
##########
File path: python/source/io.rst
##########
@@ -497,3 +497,39 @@ the parquet file as :class:`ChunkedArray`
pyarrow.Table
col1: int64
ChunkedArray = 0 .. 99
+
+Reading Line Delimited JSON
+===========================
+
+Arrow has builtin support for line-delimited JSON.
+Each line represents a row of data as a JSON object.
+
+Given some data in a file where each line is a JSON object
+containing a row of data:
+
+.. testcode::
+
+    import tempfile
+
+    with tempfile.NamedTemporaryFile(delete=False, mode="w+") as f:
+        f.write('{"a": 1, "b": 2.0, "c": 1}\n')
+        f.write('{"a": 3, "b": 3.0, "c": 2}\n')
+        f.write('{"a": 5, "b": 4.0, "c": 3}\n')
+        f.write('{"a": 7, "b": 5.0, "c": 4}\n')
+
+The content of the file can be read back to a :class:`pyarrow.Table` using
+:func:`pyarrow.json.read_json`
+
+.. testcode::
+
+    import pyarrow.json as pj
Review comment:
Do we use the `pj` abbreviation anywhere else? If not, I would
personally not introduce it, and would instead use one of
`import pyarrow.json` (with `pa.json.read_json`), `from pyarrow import json`
(with `json.read_json`), or `from pyarrow.json import read_json`. (I don't know
whether we are consistent with one of those patterns for other modules.)
(I know `pq` is widely used for parquet, and it is probably too late
to change that, but I would personally not mimic it for all our modules.)