This is an automated email from the ASF dual-hosted git repository.

apitrou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-cookbook.git


The following commit(s) were added to refs/heads/main by this push:
     new 574be8e  Recipe to read line delimited json as of ARROW-13708 (#49)
574be8e is described below

commit 574be8ec6344f2361228db6fbf5b2736804da4c6
Author: Alessandro Molina <[email protected]>
AuthorDate: Wed Sep 1 16:37:54 2021 +0200

    Recipe to read line delimited json as of ARROW-13708 (#49)
    
    * Recipe to read json
    
    * rename pj to pa.json
    
    * Add colon
---
 python/source/io.rst | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/python/source/io.rst b/python/source/io.rst
index b5a9c70..2c1fd82 100644
--- a/python/source/io.rst
+++ b/python/source/io.rst
@@ -517,3 +517,40 @@ the parquet file as :class:`ChunkedArray`
     pyarrow.Table
     col1: int64
     ChunkedArray = 0 .. 99
+
+Reading Line-Delimited JSON
+===========================
+
+Arrow has built-in support for line-delimited JSON.
+Each line represents a row of data as a JSON object.
+
+Given some data in a file where each line is a JSON object
+containing a row of data:
+
+.. testcode::
+
+    import tempfile
+
+    with tempfile.NamedTemporaryFile(delete=False, mode="w+") as f:
+        f.write('{"a": 1, "b": 2.0, "c": 1}\n')
+        f.write('{"a": 3, "b": 3.0, "c": 2}\n')
+        f.write('{"a": 5, "b": 4.0, "c": 3}\n')
+        f.write('{"a": 7, "b": 5.0, "c": 4}\n')
+
+The content of the file can be read back to a :class:`pyarrow.Table` using
+:func:`pyarrow.json.read_json`:
+
+.. testcode::
+
+    import pyarrow as pa
+    import pyarrow.json
+
+    table = pa.json.read_json(f.name)
+
+.. testcode::
+
+    print(table.to_pydict())
+
+.. testoutput::
+
+    {'a': [1, 3, 5, 7], 'b': [2.0, 3.0, 4.0, 5.0], 'c': [1, 2, 3, 4]}
\ No newline at end of file
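
For readers unfamiliar with the format, the row-to-column conversion that the
recipe's output illustrates can be sketched in plain Python. This is only an
illustration using the standard library, not how Arrow implements its reader;
the helper name rows_to_columns is hypothetical, and the sketch assumes every
line carries the same keys.

```python
import json

# The same four rows that the recipe writes to its temporary file.
lines = [
    '{"a": 1, "b": 2.0, "c": 1}',
    '{"a": 3, "b": 3.0, "c": 2}',
    '{"a": 5, "b": 4.0, "c": 3}',
    '{"a": 7, "b": 5.0, "c": 4}',
]


def rows_to_columns(json_lines):
    """Hypothetical helper: parse one JSON object per line, pivot rows to columns."""
    columns = {}
    for line in json_lines:
        if not line.strip():
            continue  # skip blank lines, e.g. a trailing newline
        row = json.loads(line)
        for key, value in row.items():
            # Assumes every row has the same keys, so all columns stay equal length.
            columns.setdefault(key, []).append(value)
    return columns


columns = rows_to_columns(lines)
print(columns)
```

The printed dictionary has the same shape as the value shown in the recipe's
testoutput block: one list per key, with one entry per input line.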
