jorisvandenbossche opened a new issue, #42016:
URL: https://github.com/apache/arrow/issues/42016
Reading and writing data with a float16 field works just fine (because it's
implemented on the C++ side):
```python
>>> import numpy as np
>>> import pyarrow as pa
>>> import pyarrow.parquet as pq
>>> table = pa.table({"a": np.array([0.1, 0.2], "float16"), "b": np.array([1, 2], "int8")})
>>> pq.write_table(table, "/tmp/test_float16.parquet")
>>> meta = pq.read_metadata("/tmp/test_float16.parquet")
>>> meta.schema
<pyarrow._parquet.ParquetSchema object at 0x7f488ec55000>
required group field_id=-1 schema {
optional fixed_len_byte_array(2) field_id=-1 a (Float16);
optional int32 field_id=-1 b (Int(bitWidth=8, isSigned=true));
}
```
But in a few parts of the API you can see that we didn't add it to the Python
bindings:
```python
>>> meta.schema.column(0).logical_type
<pyarrow._parquet.ParquetLogicalType object at 0x7f488ef60210>
Float16
>>> meta.schema.column(1).logical_type
<pyarrow._parquet.ParquetLogicalType object at 0x7f4894c9fcf0>
Int(bitWidth=8, isSigned=true)
>>> meta.schema.column(0).logical_type.type
'UNKNOWN'  # <--- UNKNOWN instead of FLOAT16 here
>>> meta.schema.column(1).logical_type.type
'INT'
```
That comes from
https://github.com/apache/arrow/blob/374b8f6ddec3b7614408ea874ffb29981c2a295d/python/pyarrow/_parquet.pyx#L1290-L1306
(which might actually be the only place where it needs to be added).
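Conceptually, the fix at the linked lines would amount to one extra entry in the enum-to-name mapping, so that the Float16 id no longer falls through to `'UNKNOWN'`. A pseudocode sketch only; the identifiers below are hypothetical, not the actual names in `_parquet.pyx`:

```
# hypothetical enum-to-name mapping in the Python bindings
logical_type_names = {
    LogicalTypeId_INT:     'INT',      # existing entry (illustrative)
    LogicalTypeId_FLOAT16: 'FLOAT16',  # the missing entry for Float16
}
# lookup falls back to 'UNKNOWN' when the id is absent, which is
# what produces the behaviour shown above
```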