westonpace commented on a change in pull request #66:
URL: https://github.com/apache/arrow-cookbook/pull/66#discussion_r703733626
##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
:func:`pyarrow.array` for conversion to Arrow arrays,
and will benefit from zero copy behaviour when possible.
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data
+to the destination, :class:`pyarrow.RecordBatch` are the way
+Arrow represents batches of data, they can be seen as a slice
+of a table.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    batch = pa.RecordBatch.from_arrays([
+        pa.array([1, 2, 3, 4, 5]),
+        pa.array([10, 20, 30, 40, 50])
+    ], names=["first", "second"])
Review comment:
Minor nit: Since we are introducing the concept of record batches here,
it may be clearer to the user if the column names and values are meaningful.
As it stands, it might not be immediately obvious that `first` corresponds to
`[1, 2, 3, 4, 5]` and `second` corresponds to `[10, 20, 30, 40, 50]`. If the
example used real-world data, the mapping would be clearer.
##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
:func:`pyarrow.array` for conversion to Arrow arrays,
and will benefit from zero copy behaviour when possible.
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data
+to the destination, :class:`pyarrow.RecordBatch` are the way
+Arrow represents batches of data, they can be seen as a slice
+of a table.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    batch = pa.RecordBatch.from_arrays([
+        pa.array([1, 2, 3, 4, 5]),
+        pa.array([10, 20, 30, 40, 50])
+    ], names=["first", "second"])
+
+multiple batches can be combined into a table using
Review comment:
```suggestion
Multiple batches can be combined into a table using
```
##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
:func:`pyarrow.array` for conversion to Arrow arrays,
and will benefit from zero copy behaviour when possible.
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data
Review comment:
```suggestion
Most I/O operations in Arrow happen when shipping batches of data
```
##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
:func:`pyarrow.array` for conversion to Arrow arrays,
and will benefit from zero copy behaviour when possible.
+Creating RecordBatches
Review comment:
Nit: Since the titles are prose, I think it would be more correct to
separate the words.
```suggestion
Creating Record Batches
```
##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
:func:`pyarrow.array` for conversion to Arrow arrays,
and will benefit from zero copy behaviour when possible.
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data
+to the destination, :class:`pyarrow.RecordBatch` are the way
Review comment:
```suggestion
to the destination. :class:`pyarrow.RecordBatch` is the way
```
##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
:func:`pyarrow.array` for conversion to Arrow arrays,
and will benefit from zero copy behaviour when possible.
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data
+to the destination, :class:`pyarrow.RecordBatch` are the way
+Arrow represents batches of data, they can be seen as a slice
+of a table.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    batch = pa.RecordBatch.from_arrays([
+        pa.array([1, 2, 3, 4, 5]),
+        pa.array([10, 20, 30, 40, 50])
+    ], names=["first", "second"])
+
+multiple batches can be combined into a table using
+:meth:`pyarrow.Table.from_batches`
+
+.. testcode::
+
+    second_batch = pa.RecordBatch.from_arrays([
+        pa.array([6, 7, 8, 9, 10]),
+        pa.array([60, 70, 80, 90, 100])
+    ], names=["first", "second"])
+
+    table = pa.Table.from_batches([batch, second_batch])
+
+.. testcode::
+
+ print(table)
+
+.. testoutput::
+
+ pyarrow.Table
+ first: int64
+ second: int64
+
+Equally, :class:`pyarrow.Table` can be converted to a set of
Review comment:
```suggestion
Equally, :class:`pyarrow.Table` can be converted to a list of
```