thisisnic commented on a change in pull request #76:
URL: https://github.com/apache/arrow-cookbook/pull/76#discussion_r719367995
##########
File path: python/source/data.rst
##########
@@ -137,3 +137,48 @@ function
.. testoutput::
0 .. 198
+
+Appending tables to an existing table
+=====================================
+
+If you have data split in two different tables, it's possible
+to combine them into a single table that is the concatenation
+of the original tables.
+
+If we have the list of oscar nominations divided in two different tables:
Review comment:
```suggestion
If we have the list of Oscar nominations divided between two different
tables:
```
##########
File path: python/source/data.rst
##########
@@ -137,3 +137,48 @@ function
.. testoutput::
0 .. 198
+
+Appending tables to an existing table
+=====================================
+
+If you have data split in two different tables, it's possible
+to combine them into a single table that is the concatenation
+of the original tables.
+
+If we have the list of oscar nominations divided in two different tables:
+
+.. testcode::
+
+ import pyarrow as pa
+
+ oscar_nominations_1 = pa.table([
+ ["Meryl Streep", "Katharine Hepburn"],
+ [21, 12]
+ ], names=["actor", "nominations"])
+
+ oscar_nominations_2 = pa.table([
+ ["Jack Nicholson", "Bette Davis"],
+ [12, 10]
+ ], names=["actor", "nominations"])
+
+We can join them into a single one using :func:`pyarrow.concat_tables`:
Review comment:
Swapped "join" for "combine" to make it clear we're not talking about
SQL-style joins
```suggestion
We can combine them into a single table using :func:`pyarrow.concat_tables`:
```
##########
File path: python/source/data.rst
##########
@@ -137,3 +137,48 @@ function
.. testoutput::
0 .. 198
+
+Appending tables to an existing table
+=====================================
+
+If you have data split in two different tables, it's possible
+to combine them into a single table that is the concatenation
+of the original tables.
+
+If we have the list of oscar nominations divided in two different tables:
+
+.. testcode::
+
+ import pyarrow as pa
+
+ oscar_nominations_1 = pa.table([
+ ["Meryl Streep", "Katharine Hepburn"],
+ [21, 12]
+ ], names=["actor", "nominations"])
+
+ oscar_nominations_2 = pa.table([
+ ["Jack Nicholson", "Bette Davis"],
+ [12, 10]
+ ], names=["actor", "nominations"])
+
+We can join them into a single one using :func:`pyarrow.concat_tables`:
+
+.. testcode::
+
+ oscar_nominations = pa.concat_tables([oscar_nominations_1,
+ oscar_nominations_2])
+
+ print(oscar_nominations.to_pydict())
+
+.. testoutput::
+
+ {'actor': ['Meryl Streep', 'Katharine Hepburn', 'Jack Nicholson', 'Bette Davis'], 'nominations': [21, 12, 12, 10]}
+
+.. note::
+
+ By default appending two tables is a zero-copy operation, that doesn't need to
+ copy or rewrite data. As tables are made of :class:`pyarrow.ChunkedArray`
+ the result will be a table with multiple chunks, each pointing to the original
+ data that has been appended. Under some conditions, Arrow might have to
+ do casts (if `promote=True`) and in such cases the data will need to be copied
+ and an extra cost will occurr.
Review comment:
```suggestion
By default, appending two tables is a zero-copy operation that doesn't need to
copy or rewrite data. As tables are made of :class:`pyarrow.ChunkedArray`,
the result will be a table with multiple chunks, each pointing to the original
data that has been appended. Under some conditions, Arrow might have to
cast data from one type to another (if `promote=True`). In such cases the data
will need to be copied and an extra cost will occur.
```
##########
File path: python/source/data.rst
##########
@@ -137,3 +137,48 @@ function
.. testoutput::
0 .. 198
+
+Appending tables to an existing table
+=====================================
+
+If you have data split in two different tables, it's possible
+to combine them into a single table that is the concatenation
+of the original tables.
Review comment:
Added explicit mention of rows to make sure it's clear this isn't about
columns.
```suggestion
If you have data split across two different tables, it is possible
to concatenate their rows into a single table.
```