[ https://issues.apache.org/jira/browse/HUDI-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Geser Dugarov updated HUDI-7952:
--------------------------------
Description:
The fix for the ClassCastException in https://issues.apache.org/jira/browse/HUDI-7709
with nulls as partition column values could lead to empty query results.
HoodieFileIndex.listFiles() would return a Seq of
PartitionDirectory with null values.
But there is another problem with range filters on the partition
column.
For instance, column ts holds UNIX_TIMESTAMP values,
and the table is also partitioned by ts with
hoodie.keygen.timebased.output.dateformat = "yyyy-MM-dd HH"
When executing a query like:
SELECT ... WHERE ts BETWEEN 1078016000 and 1718953003 ...
it's not possible to filter rows properly.
was:
Fix of ClassCastException in https://issues.apache.org/jira/browse/HUDI-7709
with nulls as partition columns values could lead to an empty query results.
HoodieFileIndex.listFiles() would return Seq of
PartitionDirectory with null values.
But there is another problem with partition range filters.
For instance, for UNIX_TIMESTAMP, column ts, we set:
SELECT ... WHERE ts BETWEEN 1078016000 and 1718953003 ...
And the table is also partitioned by ts with
hoodie.keygen.timebased.output.dateformat = "yyyy-MM-dd HH"
> Incorrect partition pruning when TimestampBasedKeyGenerator is used
> -------------------------------------------------------------------
>
> Key: HUDI-7952
> URL: https://issues.apache.org/jira/browse/HUDI-7952
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Geser Dugarov
> Assignee: Geser Dugarov
> Priority: Major
>
> The fix for the ClassCastException in https://issues.apache.org/jira/browse/HUDI-7709
> with nulls as partition column values could lead to empty query results.
> HoodieFileIndex.listFiles() would return a Seq of
> PartitionDirectory with null values.
>
> But there is another problem with range filters on the partition
> column.
> For instance, column ts holds UNIX_TIMESTAMP values,
> and the table is also partitioned by ts with
> hoodie.keygen.timebased.output.dateformat = "yyyy-MM-dd HH"
> When executing a query like:
> SELECT ... WHERE ts BETWEEN 1078016000 and 1718953003 ...
> it's not possible to filter rows properly.
>
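The range-filter problem described above can be illustrated with a small standalone sketch. This is not Hudi code: it assumes that ts holds epoch seconds, that partition values are rendered in UTC, and that "%Y-%m-%d %H" is an adequate Python stand-in for the Java pattern "yyyy-MM-dd HH". The helper names partition_value and partition_may_match are hypothetical. The sketch shows that comparing the raw numeric bounds with a formatted partition value rejects a partition that does contain matching rows, while converting the bounds to the partition format first gives the expected answer.

```python
from datetime import datetime, timezone

# Python stand-in for the Java pattern "yyyy-MM-dd HH" (assumption: UTC output).
PARTITION_FORMAT = "%Y-%m-%d %H"


def partition_value(epoch_seconds: int) -> str:
    """Render epoch seconds the way the partition value would be formatted."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return dt.strftime(PARTITION_FORMAT)


def partition_may_match(partition: str, lower: int, upper: int) -> bool:
    """Hypothetical pruning check: convert both bounds to the partition format.

    This particular format sorts lexicographically in time order, so plain
    string comparison is valid once both sides share one representation.
    """
    return partition_value(lower) <= partition <= partition_value(upper)


# Bounds taken from the query in the description.
lower, upper = 1078016000, 1718953003

# A row whose ts lies inside the BETWEEN range, and the partition it lands in.
row_ts = 1700000000
partition = partition_value(row_ts)

# Naive check: raw bounds compared with the formatted partition value.
# The representations differ, so the result says nothing about time order.
naive_match = str(lower) <= partition <= str(upper)

# Check with the bounds converted to the partition format first.
converted_match = partition_may_match(partition, lower, upper)

print(partition, naive_match, converted_match)
```

Here the row satisfies the BETWEEN predicate, yet the naive comparison would prune its partition. Truncating the bounds to the hour keeps the check conservative: it can keep a partition with no matching rows, but never drops one that has them, so the original predicate on ts still has to be applied to the rows afterwards.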
--
This message was sent by Atlassian Jira
(v8.20.10#820010)