jorisvandenbossche commented on a change in pull request #75:
URL: https://github.com/apache/arrow-cookbook/pull/75#discussion_r709318491



##########
File path: python/source/schema.rst
##########
@@ -108,4 +108,87 @@ as far as they are compatible
     pyarrow.Table
     col1: int32
     col2: string
-    col3: double
\ No newline at end of file
+    col3: double
+
+Merging multiple schemas
+========================
+
+When you have multiple separate groups of data that you want to combine
+it might be necessary to unify their schemas to create a superset of them
+that applies to all data sources.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    first_schema = pa.schema([
+        ("country", pa.string()),
+        ("population", pa.int32())
+    ])
+
+    second_schema = pa.schema([
+        ("country_code", pa.string()),
+        ("language", pa.string())
+    ])
+
+:func:`unify_schemas` can be used to combine multiple schemas into
+a single one:
+
+.. testcode::
+
+    union_schema = pa.unify_schemas([first_schema, second_schema])
+
+    print(union_schema)
+
+.. testoutput::
+
+    country: string
+    population: int32
+    country_code: string
+    language: string
+
+If the combined schemas have overlapping columns, they can still be combined
+as long as the colliding columns retain the same type (``country_code``):

Review comment:
       Small suggestion (take it or leave it :)): I would show an example
with a column in common right away, instead of splitting this into two
examples, first without a common column and then with one.
   Schemas with no columns in common at all don't seem like the typical use
case for this, so starting with the second example would simplify the cookbook
entry a bit (one example fewer to show), while you can still explicitly mention
how the common column is handled.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
