Renkai commented on issue #2444:
URL: https://github.com/apache/arrow-rs/issues/2444#issuecomment-1216760591

   @tustvold Thanks a lot!
   
   I replaced the generator with this one; the main change is from
   `pa.PyExtensionType` to `pa.ExtensionType`. The Rust Parquet parser now works
   well, except that it reads the data type as `FixedSizeBinary(16)`. I think
   that is a slight difference in behavior from the C++ parser. In practice, I
   can continue my adventure, but would you consider making the community less
   divergent?
   
   ```
   import pyarrow as pa
   
   
   class UuidType(pa.ExtensionType):
   
       def __init__(self):
           pa.ExtensionType.__init__(self, pa.binary(16), "lance.uuid")
   
       def __arrow_ext_serialize__(self):
           # since we don't have a parameterized type, we don't need extra
           # metadata to be deserialized
           return b''
   
       @classmethod
       def __arrow_ext_deserialize__(cls, storage_type, serialized):
           # return an instance of this subclass given the serialized
           # metadata.
           return UuidType()
   
   
   if __name__ == '__main__':
       uuid_type = UuidType()
       print(uuid_type.extension_name)
       print(uuid_type.storage_type)
       import uuid
   
       storage_array = pa.array([uuid.uuid4().bytes for _ in range(4)], pa.binary(16))
       arr = pa.ExtensionArray.from_storage(uuid_type, storage_array)
       print(arr)
       table = pa.Table.from_arrays([arr], names=["uuid"])
       import pyarrow.parquet as pq
   
       pq.write_table(table, "extension_example.parquet")
   
   ```


