ldsantos0911 commented on PR #2155:
URL: https://github.com/apache/iceberg-python/pull/2155#issuecomment-3099308188
> I think there are 2 different issues here.
>
> For V3 tables, yes, I think we can map pyarrow's null type to UnknownType
For V1/V2 tables, I think its correct that `visit_pyarrow` should throw for
pyarrow null type. There's not an equivalent iceberg type mapping for pyarrow's
null type.
>
> For the example in #2119, the workaround is to make sure that the pyarrow
> table does not contain null type. You can do that by explicitly passing the
> schema when creating the pyarrow table:
>
> ```
> import pyarrow as pa
>
> # table created with the below pyarrow schema
> schema = pa.schema(
>     [
>         pa.field("col1", pa.string(), nullable=True),
>     ]
> )
>
> df = pa.Table.from_pylist(
>     [
>         {"col1": None}
>     ],
>     schema=schema
> )
>
> df.schema
> ```
Thank you for the additional context regarding V1/V2; I believe that makes
sense. For V3 tables, it looks like `null` is already being cast to
`UnknownType`. However, as far as I can tell, the missing link is type
promotion: `UnknownType` needs to be treated as compatible with whatever
concrete type the table column has (`XType`). Otherwise, as noted above, we see
`ValueError: Mismatch in fields`.
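
To make the idea concrete, here is a rough sketch of the compatibility rule I
have in mind. Everything in it is hypothetical: the type classes are plain
stand-ins rather than pyiceberg's classes, and `is_promotable` is an
illustrative name, not an existing function. It also assumes that an unknown
type may be promoted to any concrete type on V3 tables only.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for Iceberg types; these are NOT pyiceberg classes.
@dataclass(frozen=True)
class UnknownType:
    pass


@dataclass(frozen=True)
class StringType:
    pass


@dataclass(frozen=True)
class LongType:
    pass


def is_promotable(provided, requested, format_version: int) -> bool:
    """Return True if a field of type `provided` can be written to a column
    of type `requested` (illustrative rule only)."""
    # Identical types are always compatible.
    if provided == requested:
        return True
    # Assumption: on V3 tables, an all-null (unknown) field is compatible
    # with any concrete column type; on V1/V2 it is rejected.
    if isinstance(provided, UnknownType):
        return format_version >= 3
    # Other promotions (e.g. int to long) are left out of this sketch.
    return False


print(is_promotable(UnknownType(), StringType(), format_version=3))  # True
print(is_promotable(UnknownType(), StringType(), format_version=2))  # False
print(is_promotable(LongType(), StringType(), format_version=3))     # False
```

With a rule like this in the schema compatibility check, an all-null column
would no longer be reported as a mismatch against `XType` on V3 tables, while
V1/V2 behavior would stay as it is today.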
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]