khwilson commented on code in PR #43849:
URL: https://github.com/apache/arrow/pull/43849#discussion_r1733725737


##########
docs/source/python/extending_types.rst:
##########
@@ -131,58 +131,82 @@ and serialization mechanism. The extension name and serialized metadata
 can potentially be recognized by other (non-Python) Arrow implementations
 such as PySpark.
 
-For example, we could define a custom UUID type for 128-bit numbers which can
-be represented as ``FixedSizeBinary`` type with 16 bytes::
+For example, we could define a custom rational type for fractions which can
+be represented as a pair of integers::
 
-    class UuidType(pa.ExtensionType):
+    import pyarrow as pa
+    import pyarrow.types as pt
+
+    class RationalType(pa.ExtensionType):
 
         def __init__(self):
-            super().__init__(pa.binary(16), "my_package.uuid")
 
-        def __arrow_ext_serialize__(self):
-            # Since we don't have a parameterized type, we don't need extra
-            # metadata to be deserialized
-            return b''
+            super().__init__(
+                pa.struct(
+                    [
+                        ("numer", pa.int32()),
+                        ("denom", pa.int32()),
+                    ],
+                ),
+                "my_package.rational",
+            )
+
+        def __arrow_ext_serialize__(self) -> bytes:
+            # No serialized metadata necessary
+            return b""
 
         @classmethod
-        def __arrow_ext_deserialize__(cls, storage_type, serialized):
+        def __arrow_ext_deserialize__(self, storage_type, serialized):
             # Sanity checks, not required but illustrate the method signature.
-            assert storage_type == pa.binary(16)
+            assert pt.is_struct(storage_type)
+            assert pt.is_int32(storage_type[0].type)
             assert serialized == b''
-            # Return an instance of this subclass given the serialized
-            # metadata.
-            return UuidType()
+
+            # return an instance of this subclass given the serialized
+            # metadata
+            return RationalType()
+
 
 The special methods ``__arrow_ext_serialize__`` and ``__arrow_ext_deserialize__``
-define the serialization of an extension type instance. For non-parametric
-types such as the above, the serialization payload can be left empty.
+define the serialization of an extension type instance.
 
 This can now be used to create arrays and tables holding the extension type::
 
-    >>> uuid_type = UuidType()
-    >>> uuid_type.extension_name
-    'my_package.uuid'
-    >>> uuid_type.storage_type
-    FixedSizeBinaryType(fixed_size_binary[16])
-
-    >>> import uuid
-    >>> storage_array = pa.array([uuid.uuid4().bytes for _ in range(4)], pa.binary(16))
-    >>> arr = pa.ExtensionArray.from_storage(uuid_type, storage_array)
+    >>> rational_type = RationalType()
+    >>> rational_type.extension_name
+    'my_package.rational'
+    >>> rational_type.storage_type
+    StructType(struct<numer: int32, denom: int32>)
+
+    >>> storage_array = pa.array(
+    ... [
+    ...     {"numer": 10, "denom": 17},
+    ...     {"numer": 20, "denom": 13},
+    ... ],
+    ... type=rational_type.storage_type

Review Comment:
   Done!


