[GitHub] [arrow-cookbook] amol- commented on a change in pull request #75: ARROW-13725: Unify schemas recipe

GitBox Mon, 20 Sep 2021 02:29:39 -0700


amol- commented on a change in pull request #75:
URL: https://github.com/apache/arrow-cookbook/pull/75#discussion_r712008177




##########
File path: python/source/schema.rst
##########
@@ -108,4 +108,87 @@ as far as they are compatible
     pyarrow.Table
     col1: int32
     col2: string
-    col3: double
\ No newline at end of file
+    col3: double
+
+Merging multiple schemas
+========================
+
+When you have multiple separate groups of data that you want to combine
+it might be necessary to unify their schemas to create a superset of them
+that applies to all data sources.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    first_schema = pa.schema([
+        ("country", pa.string()),
+        ("population", pa.int32())
+    ])
+
+    second_schema = pa.schema([
+        ("country_code", pa.string()),
+        ("language", pa.string())
+    ])
+
+:func:`unify_schemas` can be used to combine multiple schemas into
+a single one:
+
+.. testcode::
+
+    union_schema = pa.unify_schemas([first_schema, second_schema])
+
+    print(union_schema)
+
+.. testoutput::
+
+    country: string
+    population: int32
+    country_code: string
+    language: string
+
+If the combined schemas have overlapping columns, they can still be combined
+as far as the colliding columns retain the same type (``country_code``):

Review comment:
I deliberately showed the case with no overlapping columns first. When
talking about "merging" two things, it is easier to follow what is going on
if you first see a perfect union of the two original collections.




