[ https://issues.apache.org/jira/browse/SPARK-57571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk updated SPARK-57571:
-----------------------------
Description:
h2. What
Support pushing down filters on {{TimeType}} columns to the ORC reader,
matching the existing Parquet TIME filter pushdown (SPARK-51687).
h2. Gap
{{OrcFilters.getPredicateLeafType}} (sql/core/.../execution/datasources/orc/OrcFilters.scala)
handles Date/Timestamp/TimestampNTZ but has no {{TimeType}} case, so TIME predicates are not pushed down to ORC.
Native ORC read/write for TIME is already supported (SPARK-54472).
h2. Scope
* Add a {{TimeType}} branch in OrcFilters (predicate leaf type LONG, value = nanos-of-day) plus the corresponding value conversion.
* Push down only the supported comparisons (EQ/LT/LTE/GT/GTE/IN/IsNull).
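
The two scope items above can be illustrated with a minimal, self-contained sketch. It is not the OrcFilters code: {{TimePushdownSketch}}, {{LeafType}}, {{Op}}, {{leafTypeForTime}}, {{toNanosOfDay}} and {{canPushDown}} are hypothetical names used only for illustration, and it assumes (as stated in the scope) that a TIME literal is pushed down as a LONG holding nanos-of-day.

```java
import java.time.LocalTime;
import java.util.EnumSet;
import java.util.Set;

// Illustrative sketch only; every name here is hypothetical and none of it is Spark or ORC API.
class TimePushdownSketch {
    // Stand-in for the predicate leaf type chosen for a TIME column.
    enum LeafType { LONG }

    // Comparison kinds; STARTS_WITH is included only as an example of an unsupported one.
    enum Op { EQ, LT, LTE, GT, GTE, IN, IS_NULL, STARTS_WITH }

    // The comparisons the ticket lists as eligible for pushdown.
    static final Set<Op> SUPPORTED =
        EnumSet.of(Op.EQ, Op.LT, Op.LTE, Op.GT, Op.GTE, Op.IN, Op.IS_NULL);

    // A TIME column is described to the reader as a LONG leaf (assumption taken from the scope).
    static LeafType leafTypeForTime() {
        return LeafType.LONG;
    }

    // Value conversion: a time-of-day literal becomes nanoseconds since midnight.
    static long toNanosOfDay(LocalTime t) {
        return t.toNanoOfDay();
    }

    // Only the listed comparisons are pushed down; anything else is evaluated after the scan.
    static boolean canPushDown(Op op) {
        return SUPPORTED.contains(op);
    }

    public static void main(String[] args) {
        System.out.println(leafTypeForTime());
        System.out.println(toNanosOfDay(LocalTime.of(12, 30)));
        System.out.println(canPushDown(Op.GTE));
        System.out.println(canPushDown(Op.STARTS_WITH));
    }
}
```

Comparing the LONG values preserves the ordering of the times themselves, which is what lets range filters (LT/LTE/GT/GTE) be evaluated on the converted value.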
h2. Acceptance criteria
* Equality/range filters on a TIME column push down to ORC and return correct results.
* Tests added in OrcFilterSuite (and the native/Hive ORC variants where applicable).
> Support TIME predicate pushdown to ORC
> --------------------------------------
>
> Key: SPARK-57571
> URL: https://issues.apache.org/jira/browse/SPARK-57571
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Max Gekk
> Priority: Major