westonpace commented on a change in pull request #66:
URL: https://github.com/apache/arrow-cookbook/pull/66#discussion_r703733626



##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
     :func:`pyarrow.array` for conversion to Arrow arrays,
     and will benefit from zero copy behaviour when possible.
 
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data
+to the destination, :class:`pyarrow.RecordBatch` are the way
+Arrow represents batches of data, they can be seen as a slice
+of a table.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    batch = pa.RecordBatch.from_arrays([
+        pa.array([1, 2, 3, 4, 5]),
+        pa.array([10, 20, 30, 40, 50])
+    ], names=["first", "second"])

Review comment:
       Minor nit: Since we are introducing the concept of record batches here, 
it may be clearer to the user if the column names and values are meaningful.  
As it stands, it might not be immediately obvious that `first` corresponds to 
`[1, 2, 3, 4, 5]` and `second` corresponds to `[10, 20, 30, 40, 50]`.  With an 
example based on real-world data, the pairing would be easier to see.

##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
     :func:`pyarrow.array` for conversion to Arrow arrays,
     and will benefit from zero copy behaviour when possible.
 
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data
+to the destination, :class:`pyarrow.RecordBatch` are the way
+Arrow represents batches of data, they can be seen as a slice
+of a table.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    batch = pa.RecordBatch.from_arrays([
+        pa.array([1, 2, 3, 4, 5]),
+        pa.array([10, 20, 30, 40, 50])
+    ], names=["first", "second"])
+
+multiple batches can be combined into a table using 

Review comment:
       ```suggestion
   Multiple batches can be combined into a table using 
   ```

##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
     :func:`pyarrow.array` for conversion to Arrow arrays,
     and will benefit from zero copy behaviour when possible.
 
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data

Review comment:
       ```suggestion
   Most I/O operations in Arrow happen when shipping batches of data
   ```

##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
     :func:`pyarrow.array` for conversion to Arrow arrays,
     and will benefit from zero copy behaviour when possible.
 
+Creating RecordBatches

Review comment:
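       Nit: Given that the titles are prose, I think it would be more correct to 
separate the words.  Note that the `=` underline on the following line would 
need one more character to stay at least as long as the new title.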
   ```suggestion
   Creating Record Batches
   ```

##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
     :func:`pyarrow.array` for conversion to Arrow arrays,
     and will benefit from zero copy behaviour when possible.
 
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data
+to the destination, :class:`pyarrow.RecordBatch` are the way

Review comment:
       ```suggestion
   to the destination.  :class:`pyarrow.RecordBatch` is the way
   ```

##########
File path: python/source/create.rst
##########
@@ -67,6 +67,58 @@ from a variety of inputs, including plain python objects
     :func:`pyarrow.array` for conversion to Arrow arrays,
     and will benefit from zero copy behaviour when possible.
 
+Creating RecordBatches
+======================
+
+Most IO operations in Arrow happen shipping batches of data
+to the destination, :class:`pyarrow.RecordBatch` are the way
+Arrow represents batches of data, they can be seen as a slice
+of a table.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    batch = pa.RecordBatch.from_arrays([
+        pa.array([1, 2, 3, 4, 5]),
+        pa.array([10, 20, 30, 40, 50])
+    ], names=["first", "second"])
+
+multiple batches can be combined into a table using 
+:meth:`pyarrow.Table.from_batches`
+
+.. testcode::
+
+    second_batch = pa.RecordBatch.from_arrays([
+        pa.array([6, 7, 8, 9, 10]),
+        pa.array([60, 70, 80, 90, 100])
+    ], names=["first", "second"])
+
+    table = pa.Table.from_batches([batch, second_batch])
+
+.. testcode::
+
+    print(table)
+
+.. testoutput::
+
+    pyarrow.Table
+    first: int64
+    second: int64
+
+Equally, :class:`pyarrow.Table` can be converted to a set of 

Review comment:
       ```suggestion
   Equally, :class:`pyarrow.Table` can be converted to a list of 
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


