Max Burke created ARROW-11324:
---------------------------------
Summary: [Rust] Querying datetime data in DataFusion with an
embedded timezone always fails
Key: ARROW-11324
URL: https://issues.apache.org/jira/browse/ARROW-11324
Project: Apache Arrow
Issue Type: Bug
Components: Rust - DataFusion
Reporter: Max Burke
We have hundreds of thousands of Parquet files with embedded Arrow schemas
whose time-valued columns have the type
Timestamp(TimeUnit::Nanosecond, Some("UTC")).
One of the changes between Arrow 2 and Arrow 3 made the Parquet loader prefer
the embedded Arrow schema over the one inferred from the Parquet columns.
Because DataFusion hardcodes the timezone field of the Timestamp variant as
None, we can't load any of our data after this upgrade. For example, the query
below fails with the plan error that follows it:
{{SELECT * FROM parquet_table WHERE ("timestamp" >=
to_timestamp('2010-03-24T13:00:00.000000Z') AND "timestamp" <=
to_timestamp('2010-03-25T00:00:00.000000Z')) ORDER BY timestamp ASC NULLS
LAST;}}
{{Plan("\'Timestamp(Nanosecond, Some(\"UTC\")) >= Timestamp(Nanosecond, None)\'
can\'t be evaluated because there isn\'t a common type to coerce the types
to")}}
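To illustrate the mismatch, here is a minimal, self-contained sketch. It does not use the arrow or datafusion crates: TimeUnit and DataType below are simplified stand-ins, and strict_common_type and relaxed_common_type are hypothetical helpers, not DataFusion's actual coercion code. The strict rule mirrors the behaviour we are seeing (a timezone-aware timestamp and a timezone-less one have no common type). The relaxed rule is only one possible direction for a fix, and it assumes that a missing timezone can safely adopt the other side's timezone.

```rust
// Simplified stand-ins for arrow's TimeUnit and DataType (assumption: the real
// Timestamp variant has the same shape, a unit plus an optional timezone).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum TimeUnit {
    Microsecond,
    Nanosecond,
}

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Timestamp(TimeUnit, Option<String>),
}

/// Hypothetical strict rule: two types share a common type only if they are
/// identical, so Some("UTC") vs None yields no common type.
fn strict_common_type(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    if lhs == rhs {
        Some(lhs.clone())
    } else {
        None
    }
}

/// Hypothetical relaxed rule: timestamps with the same unit are compatible
/// when at most one side names a timezone, or both name the same one.
fn relaxed_common_type(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    match (lhs, rhs) {
        (DataType::Timestamp(lu, ltz), DataType::Timestamp(ru, rtz)) if lu == ru => {
            match (ltz, rtz) {
                // Two different explicit timezones: refuse to guess.
                (Some(a), Some(b)) if a != b => None,
                // One side (or both, identically) names a timezone: keep it.
                (Some(tz), _) | (_, Some(tz)) => {
                    Some(DataType::Timestamp(lu.clone(), Some(tz.clone())))
                }
                // Neither side names a timezone.
                (None, None) => Some(DataType::Timestamp(lu.clone(), None)),
            }
        }
        // Different time units are out of scope for this sketch.
        _ => None,
    }
}
```

With the column type Timestamp(Nanosecond, Some("UTC")) on one side and the to_timestamp result type Timestamp(Nanosecond, None) on the other, strict_common_type returns None, which corresponds to the plan error above, while relaxed_common_type returns the UTC-tagged type.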
Any ideas/thoughts?