amol- commented on a change in pull request #1:
URL: https://github.com/apache/arrow-cookbook/pull/1#discussion_r676499072
##########
File path: python/source/io.rst
##########
@@ -0,0 +1,423 @@
+========================
+Reading and Writing Data
+========================
+
+Recipes related to reading and writing data from disk using
+Apache Arrow.
+
+.. contents::
+
+Write a Parquet file
+====================
+
+.. testsetup::
+
+ import numpy as np
+ import pyarrow as pa
+
+ arr = pa.array(np.arange(100))
+
+Given an array with all numbers from 0 to 100
+
+.. testcode::
+
+ print(f"{arr[0]} .. {arr[-1]}")
+
+.. testoutput::
+
+ 0 .. 99
+
+To write it to a Parquet file, as Parquet is a columnar format,
+we must create a :class:`pyarrow.Table` out of it,
+so that we get a table of a single column which can then be
+written to a Parquet file.
+
+.. testcode::
+
+ table = pa.Table.from_arrays([arr], names=["col1"])
+
+Once we have a table, it can be written to a Parquet File
+using the functions provided by the ``pyarrow.parquet`` module
+
+.. testcode::
+
+ import pyarrow.parquet as pq
+
+ pq.write_table(table, "example.parquet", compression=None)
+
+Reading a Parquet file
+======================
+
+Given a Parquet file, it can be read back to a :class:`pyarrow.Table`
+by using :func:`pyarrow.parquet.read_table` function
+
+.. testcode::
+
+ import pyarrow.parquet as pq
+
+ table = pq.read_table("example.parquet")
+
+The resulting table will contain the same columns that existed in
+the parquet file as :class:`ChunkedArray`
+
+.. testcode::
+
+ print(table)
+
+ col1 = table["col1"]
+ print(f"{type(col1).__name__} = {col1[0]} .. {col1[-1]}")
+
+.. testoutput::
+
+ pyarrow.Table
+ col1: int64
+ ChunkedArray = 0 .. 99
Review comment:
I was actually one of the original design goals of the cookbook, while
it's true that it might require more maintenance cost when the repr output
changes, it's a minor effort (you just need to copy the new output into the old
`.. testoutput`) but that will guarantee us that if there is _any_ change in
arrow in terms of behaviours the recipes will always reflect exactly what arrow
is going to do/answer.
We plan to setup a nightly job to verify the cookbook against the latest
arrow master so that we catch any change and can update it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]