[
https://issues.apache.org/jira/browse/HUDI-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sagar Sumit closed HUDI-7829.
-----------------------------
Resolution: Fixed
> storage partition stats index does not take effect in data skipping
> -------------------------------------------------------------------
>
> Key: HUDI-7829
> URL: https://issues.apache.org/jira/browse/HUDI-7829
> Project: Apache Hudi
> Issue Type: Bug
> Components: spark-sql
> Reporter: KnightChess
> Assignee: Sagar Sumit
> Priority: Major
> Fix For: 1.0.0
>
> Attachments: image-2024-06-05-16-30-50-503.png,
> image-2024-06-05-16-31-44-871.png, image-2024-06-05-16-32-02-293.png
>
>
> The partition stats index does not take effect: the current implementation
> does not seem to achieve partition filtering.
> - First:
> In this picture, I changed the UT filter to trigger the partition stats index.
> !image-2024-06-05-16-30-50-503.png!
> partition_stats does not store fileName, so reusing the `CSI` logic throws
> a null pointer exception on the group-by key.
> !image-2024-06-05-16-31-44-871.png!
> !image-2024-06-05-16-32-02-293.png!
> This also causes the other indexes to be skipped.
> - Second:
> I also have a question. I am not sure whether this PR is meant to prune
> partitions the way a physical partition column does, i.e. to use the min/max
> of other fields to decide which physical partitions to list file slices for,
> or to filter file names like `CSI` and `RLI`.
> Thanks.
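
The two readings in the question above can be contrasted with a minimal sketch. It is not Hudi code: `ColumnStat`, `prune_partitions`, and `prune_files` are hypothetical names, and the record layout is an assumption made only for illustration. Partition-level records carry no file name, so they can decide which partitions to list, but they cannot go through a file-level path that groups by file name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ColumnStat:
    """One min/max record for a column (hypothetical layout, not Hudi's schema)."""
    partition: str
    file_name: Optional[str]  # None for partition-level records
    min_value: int
    max_value: int


def may_contain(stat: ColumnStat, value: int) -> bool:
    """True if the [min, max] range of the record could hold `value`."""
    return stat.min_value <= value <= stat.max_value


def prune_partitions(stats: List[ColumnStat], value: int) -> List[str]:
    """Partition-level pruning: keep partitions whose range may contain `value`."""
    return sorted({s.partition for s in stats if may_contain(s, value)})


def prune_files(stats: List[ColumnStat], value: int) -> Dict[str, List[str]]:
    """File-level pruning: collect candidate file names per partition."""
    candidates: Dict[str, List[str]] = {}
    for s in stats:
        if s.file_name is None:
            # A record without a file name cannot be keyed by file.
            raise ValueError("file-level pruning needs a file name")
        if may_contain(s, value):
            candidates.setdefault(s.partition, []).append(s.file_name)
    return candidates


partition_level = [
    ColumnStat("2024-06-01", None, 0, 9),
    ColumnStat("2024-06-02", None, 10, 19),
]
file_level = [
    ColumnStat("2024-06-02", "f1.parquet", 10, 14),
    ColumnStat("2024-06-02", "f2.parquet", 15, 19),
]

print(prune_partitions(partition_level, 12))
print(prune_files(file_level, 12))
```

Under these assumptions, a lookup for the value 12 keeps only the second partition at the partition level, and only the first file at the file level. Passing the partition-level records to `prune_files` fails on the missing file name, which mirrors the null group-by key described in the first point.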
--
This message was sent by Atlassian Jira
(v8.20.10#820010)