Jonathan Vexler created HUDI-8037:
-------------------------------------
Summary: Partition query for transformed value incorrectly prunes
valid partitions
Key: HUDI-8037
URL: https://issues.apache.org/jira/browse/HUDI-8037
Project: Apache Hudi
Issue Type: Bug
Components: spark, spark-sql
Reporter: Jonathan Vexler
Assignee: Jonathan Vexler
Fix For: 0.15.1
With the timestamp keygen, the partition column can hold full timestamps while
the keygen creates partitions at day granularity, so that all records with a
timestamp on 7-31-2024 go to the same partition even though the values in the
partition column differ by hours, minutes, etc.
This causes a problem with partition pruning. Let's say you query "select * from
table where partition < 7-31-2024 at 7am and partition > 7-31-2024 at 6am".
Since the file structure only has the partition 7-31-2024, that value is
interpreted as 7-31-2024 at 12am. 12am does not fall between 6am and 7am, so
the partition is pruned from the search space even though it may contain
records that match the filter.
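
The failure mode can be sketched in a few lines of Python. This is only an
illustrative model, not Hudi's or Spark's actual pruning code and not the
proposed fix: the function names naive_keep and range_keep are hypothetical,
and the one-day partition width is an assumption matching the example above.
The first function evaluates the filter against the partition value itself
(read back as midnight); the second treats the partition as covering the whole
day and keeps it if that range overlaps the filter.

```python
from datetime import datetime, timedelta

# The day partition 7-31-2024, read back from the path as midnight (assumption).
partition_value = datetime(2024, 7, 31)

# Bounds from the example query: partition > 6am and partition < 7am.
lower = datetime(2024, 7, 31, 6)
upper = datetime(2024, 7, 31, 7)


def naive_keep(partition_value, lower, upper):
    """Hypothetical pruning: apply the filter to the partition value itself."""
    return lower < partition_value < upper


def range_keep(partition_value, lower, upper, width=timedelta(days=1)):
    """Hypothetical alternative: keep the partition if the interval
    [partition_value, partition_value + width) overlaps (lower, upper)."""
    start = partition_value
    end = start + width
    return start < upper and end > lower


naive_result = naive_keep(partition_value, lower, upper)
range_result = range_keep(partition_value, lower, upper)
print(naive_result, range_result)
```

Under these assumptions the naive check drops the 7-31-2024 partition, while
the range-based check keeps it and still drops partitions for other days.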