kevinjqliu commented on issue #2372: URL: https://github.com/apache/iceberg-python/issues/2372#issuecomment-3215637994
Looks like we changed pyiceberg's internal uuid representation from `pa.binary(16)` to `pa.uuid()` in https://github.com/apache/iceberg-python/pull/2007/files#diff-8d5e63f2a87ead8cebe2fd8ac5dcf2198d229f01e16bb9e06e21f7277c328abdR687

This is in the call path for scanning:

`pyarrow_filter` -> [`expression_to_pyarrow`](https://github.com/apache/iceberg-python/blob/19efd2d1d5f1802f33c3db77eaf426323b28df97/pyiceberg/io/pyarrow.py#L1514) -> [`ConvertToArrowExpression.visit_equal`](https://github.com/apache/iceberg-python/blob/main/pyiceberg/io/pyarrow.py#L818-L819) -> [`_convert_scalar`](https://github.com/apache/iceberg-python/blob/19efd2d1d5f1802f33c3db77eaf426323b28df97/pyiceberg/io/pyarrow.py#L789-L792) -> [`schema_to_pyarrow`](https://github.com/apache/iceberg-python/blob/19efd2d1d5f1802f33c3db77eaf426323b28df97/pyiceberg/io/pyarrow.py#L688) -> [`_ConvertToArrowSchema.visit_uuid`](https://github.com/apache/iceberg-python/blob/19efd2d1d5f1802f33c3db77eaf426323b28df97/pyiceberg/io/pyarrow.py#L779)

There are other pyarrow functions that do not support uuid yet, for example https://github.com/apache/arrow/issues/47094, and #2007 already calls out that the uuid extension type is not fully supported.

Maybe we can revert to using `pa.binary(16)` as the internal representation. @Fokko wdyt?
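
For illustration only, here is a rough sketch of an equality filter on a uuid column stored as `pa.binary(16)`, which is what the proposed revert amounts to. It is not pyiceberg's code: the helper `uuid_to_binary16`, the column name `id`, and the sample values are all made up for this example. The pyarrow part only runs if pyarrow is installed.

```python
import uuid


def uuid_to_binary16(value: uuid.UUID) -> bytes:
    """Hypothetical helper: the 16 raw bytes a fixed-size binary(16) column would hold."""
    return value.bytes


# Sample values, chosen arbitrarily for the sketch.
literal = uuid.UUID("102cb62f-e6f8-4eb0-9973-d9b012ff0967")
other = uuid.UUID("639cccce-c9d2-494a-a78c-278ab234f024")
raw = uuid_to_binary16(literal)

try:
    import pyarrow as pa
    import pyarrow.compute as pc
except ImportError:  # keep the sketch runnable without pyarrow
    pa = None

matched_rows = None
if pa is not None:
    storage = pa.binary(16)  # fixed-size binary, 16 bytes per value
    table = pa.table(
        {"id": pa.array([raw, uuid_to_binary16(other)], type=storage)}
    )
    # Compare the column against a scalar of the same storage type.
    mask = pc.equal(table["id"], pa.scalar(raw, type=storage))
    matched_rows = table.filter(mask).num_rows
```

Both the column and the literal use plain fixed-size binary here, so the comparison does not rely on any compute kernel having uuid extension-type support.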
