suremarc commented on PR #7801:
URL:
https://github.com/apache/arrow-datafusion/pull/7801#issuecomment-1773144039
@devinjdangelo I attempted to use this feature in `datafusion-cli` today, as
it is useful for something I am doing. I got this error when writing to a
partitioned table:
```
This feature is not implemented: it is not yet supported to write to hive
partitions with datatype Dictionary(UInt16, Utf8)
```
Here is a repro using `datafusion-cli`:
```sql
CREATE EXTERNAL TABLE lz4_raw_compressed_larger
STORED AS PARQUET
PARTITIONED BY (partition)
LOCATION 'data/';
INSERT INTO lz4_raw_compressed_larger VALUES ('non-partition-value',
'partition');
```
Here's a [zip
file](https://github.com/apache/arrow-datafusion/files/13057020/lz4_raw_compressed_larger.zip)
with a single file in it,
`data/partition=A/lz4_raw_compressed_larger.parquet`.
I noticed the unit tests specify the schema explicitly. My guess is that when
DataFusion infers the schema, the partition columns are encoded as
dictionaries (here `Dictionary(UInt16, Utf8)`), which the partitioned write
path rejects. If partitioned writes don't work with tables whose schemas are
inferred, I think that will limit the usefulness of this feature.
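For context, here is a rough sketch of why the partition column's type has to be inferred at all: in a hive-style layout the partition value comes from the directory name (`partition=A`), not from a typed column written by the user. This is plain Python for illustration only, not DataFusion's actual code; the function name `hive_partition_values` is hypothetical.

```python
from pathlib import PurePosixPath


def hive_partition_values(path):
    """Collect key=value directory segments from a hive-style path.

    Illustrative only: this is not how DataFusion implements it.
    """
    values = {}
    # Skip the final component, which is the data file itself.
    for segment in PurePosixPath(path).parts[:-1]:
        key, sep, value = segment.partition("=")
        # Only segments of the form key=value count as partition columns.
        if sep and key:
            values[key] = value
    return values


print(hive_partition_values("data/partition=A/lz4_raw_compressed_larger.parquet"))
# prints {'partition': 'A'}
```

The value comes back as a bare string, so whatever reads the table has to choose an Arrow type for it. If my guess above is right, the inferred choice is a dictionary type, and that is what the write path then rejects.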