[I] Support writing a table with partition column(s) of type LargeUtf8 [datafusion]

via GitHub Thu, 22 Jan 2026 08:34:24 -0800


paleolimbot opened a new issue, #19939:
URL: https://github.com/apache/datafusion/issues/19939

### Is your feature request related to a problem or challenge?

The freshly released pandas 3.0 has a new string dtype that defaults to
LargeUtf8 when converted to Arrow. This may trigger a few issues with
respect to LargeUtf8 view support; however, the one that caused a failing test
for us was support for LargeUtf8 when writing Parquet with partitions
(https://github.com/apache/sedona-db/pull/538). The error is `it is not yet
supported to write to hive partitions with datatype LargeUtf8`.

### Describe the solution you'd like

I think it would be fairly easy to add a branch here to support it. I'm
happy to do this.

https://github.com/apache/datafusion/blob/9f27e933ae97a6bd90b27728abc0e0f238352835/datafusion/datasource/src/write/demux.rs#L394-L405
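
As a rough illustration of what "adding a branch" could look like, here is a minimal, standard-library-only sketch. It does not use the real DataFusion or arrow-rs types: `PartitionColumn` and `partition_values` are hypothetical stand-ins invented for this example, and the actual code in `demux.rs` may be shaped differently. The only point it makes is that a LargeUtf8 column can be rendered to partition path strings the same way as a Utf8 column, while unhandled types keep returning the existing error.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for an Arrow partition column (not a DataFusion type).
enum PartitionColumn {
    Utf8(Vec<String>),
    LargeUtf8(Vec<String>),
    /// Any type the writer does not handle, carrying only its display name.
    Unsupported(&'static str),
}

/// Render each row of a partition column as the string used in the hive path.
fn partition_values<'a>(col: &'a PartitionColumn) -> Result<Vec<Cow<'a, str>>, String> {
    match col {
        PartitionColumn::Utf8(values) => {
            Ok(values.iter().map(|v| Cow::from(v.as_str())).collect())
        }
        // The proposed extra branch: same string handling as Utf8.
        PartitionColumn::LargeUtf8(values) => {
            Ok(values.iter().map(|v| Cow::from(v.as_str())).collect())
        }
        // Everything else keeps failing with the message quoted in the issue.
        PartitionColumn::Unsupported(name) => Err(format!(
            "it is not yet supported to write to hive partitions with datatype {}",
            name
        )),
    }
}

fn main() {
    let columns = vec![
        PartitionColumn::Utf8(vec!["us".to_string()]),
        PartitionColumn::LargeUtf8(vec!["us".to_string(), "ca".to_string()]),
        PartitionColumn::Unsupported("Binary"),
    ];
    for col in &columns {
        println!("{:?}", partition_values(col));
    }
}
```

In arrow-rs the two string types differ in offset width (32-bit for Utf8, 64-bit for LargeUtf8), so a real branch would need to downcast to the large string array type rather than reuse the Utf8 downcast.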

### Describe alternatives you've considered

pandas should also probably consider just sticking with `Utf8` as the
default conversion to Arrow (there are probably other places and libraries
that don't fully support LargeUtf8 yet either).

### Additional context

_No response_

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
