thisisnic commented on a change in pull request #76:
URL: https://github.com/apache/arrow-cookbook/pull/76#discussion_r719367995



##########
File path: python/source/data.rst
##########
@@ -137,3 +137,48 @@ function
 .. testoutput::
 
   0 .. 198
+
+Appending tables to an existing table
+=====================================
+
+If you have data split in two different tables, it's possible
+to combine them into a single table that is the concatenation
+of the original tables.
+
+If we have the list of oscar nominations divided in two different tables:

Review comment:
       ```suggestion
   If we have the list of Oscar nominations divided between two different tables:
   ```

##########
File path: python/source/data.rst
##########
@@ -137,3 +137,48 @@ function
 .. testoutput::
 
   0 .. 198
+
+Appending tables to an existing table
+=====================================
+
+If you have data split in two different tables, it's possible
+to combine them into a single table that is the concatenation
+of the original tables.
+
+If we have the list of oscar nominations divided in two different tables:
+
+.. testcode::
+
+  import pyarrow as pa
+
+  oscar_nominations_1 = pa.table([
+    ["Meryl Streep", "Katharine Hepburn"],
+    [21, 12]
+  ], names=["actor", "nominations"])
+
+  oscar_nominations_2 = pa.table([
+    ["Jack Nicholson", "Bette Davis"],
+    [12, 10]
+  ], names=["actor", "nominations"])
+
+We can join them into a single one using :func:`pyarrow.concat_tables`:

Review comment:
       Swapped "join" for "combine" to make it clear we're not talking about SQL-style joins.
   
   ```suggestion
   We can combine them into a single table using :func:`pyarrow.concat_tables`:
   ```

##########
File path: python/source/data.rst
##########
@@ -137,3 +137,48 @@ function
 .. testoutput::
 
   0 .. 198
+
+Appending tables to an existing table
+=====================================
+
+If you have data split in two different tables, it's possible
+to combine them into a single table that is the concatenation
+of the original tables.
+
+If we have the list of oscar nominations divided in two different tables:
+
+.. testcode::
+
+  import pyarrow as pa
+
+  oscar_nominations_1 = pa.table([
+    ["Meryl Streep", "Katharine Hepburn"],
+    [21, 12]
+  ], names=["actor", "nominations"])
+
+  oscar_nominations_2 = pa.table([
+    ["Jack Nicholson", "Bette Davis"],
+    [12, 10]
+  ], names=["actor", "nominations"])
+
+We can join them into a single one using :func:`pyarrow.concat_tables`:
+
+.. testcode::
+
+  oscar_nominations = pa.concat_tables([oscar_nominations_1, 
+                                        oscar_nominations_2])
+
+  print(oscar_nominations.to_pydict())
+
+.. testoutput::
+
+  {'actor': ['Meryl Streep', 'Katharine Hepburn', 'Jack Nicholson', 'Bette Davis'], 'nominations': [21, 12, 12, 10]}
+
+.. note::
+
+  By default appending two tables is a zero-copy operation, that doesn't need to
+  copy or rewrite data. As tables are made of :class:`pyarrow.ChunkedArray`
+  the result will be a table with multiple chunks, each pointing to the original
+  data that has been appended. Under some conditions, Arrow might have to
+  do casts (if `promote=True`) and in such cases the data will need to be copied
+  and an extra cost will occurr.

Review comment:
       ```suggestion
     By default, appending two tables is a zero-copy operation that doesn't need to
     copy or rewrite data. As tables are made of :class:`pyarrow.ChunkedArray`,
     the result will be a table with multiple chunks, each pointing to the original
     data that has been appended. Under some conditions, Arrow might have to
     cast data from one type to another (if `promote=True`). In such cases the data
     will need to be copied and an extra cost will occur.
   ```

##########
File path: python/source/data.rst
##########
@@ -137,3 +137,48 @@ function
 .. testoutput::
 
   0 .. 198
+
+Appending tables to an existing table
+=====================================
+
+If you have data split in two different tables, it's possible
+to combine them into a single table that is the concatenation
+of the original tables.

Review comment:
       Added explicit mention of rows to make sure it's clear this isn't about columns.
   
   ```suggestion
   If you have data split across two different tables, it is possible
   to concatenate their rows into a single table.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

