[ 
https://issues.apache.org/jira/browse/HUDI-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit closed HUDI-7829.
-----------------------------
    Resolution: Fixed

> storage partition stats index does not take effect in data skipping
> -------------------------------------------------------------------
>
>                 Key: HUDI-7829
>                 URL: https://issues.apache.org/jira/browse/HUDI-7829
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: spark-sql
>            Reporter: KnightChess
>            Assignee: Sagar Sumit
>            Priority: Major
>             Fix For: 1.0.0
>
>         Attachments: image-2024-06-05-16-30-50-503.png, 
> image-2024-06-05-16-31-44-871.png, image-2024-06-05-16-32-02-293.png
>
>
> The partition stats index does not take effect: the current implementation
> does not seem to achieve partition filtering.
>  * First
> In the screenshot below, I changed the UT filter so that it triggers the
> partition stats index.
> !image-2024-06-05-16-30-50-503.png!
> partition_stats does not store fileName, so reusing the `CSI` logic throws a
> null pointer exception on the group-by key.
> !image-2024-06-05-16-31-44-871.png!
> !image-2024-06-05-16-32-02-293.png!
> This also causes the other indexes to be skipped.
>  * Second
> I also have a question. I am not sure whether this PR is meant to prune
> partitions the way a physical partition column does, that is, to use the
> min/max of other fields to decide which physical partitions to list file
> slices from, or to filter file names the way `CSI` and `RLI` do.
> Thanks
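
For readers following the second question, the two granularities can be sketched roughly as below. This is a minimal illustration in plain Python, not Hudi code: `ColumnRange`, `prune_partitions`, and `prune_files` are hypothetical names, and the sample values are made up. It assumes partition-level stats carry only a partition path plus min/max, while file-level stats also carry a file name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ColumnRange:
    """Min/max of one column; file_name is None for partition-level stats."""
    partition: str
    file_name: Optional[str]
    min_value: int
    max_value: int


def may_contain(stat: ColumnRange, value: int) -> bool:
    # An equality predicate can only match if value lies inside [min, max].
    return stat.min_value <= value <= stat.max_value


def prune_partitions(stats: List[ColumnRange], value: int) -> List[str]:
    # Partition-level pruning: group by partition only, never by file name.
    return sorted({s.partition for s in stats if may_contain(s, value)})


def prune_files(stats: List[ColumnRange], value: int) -> List[str]:
    # File-level pruning: every entry must carry a file name.
    if any(s.file_name is None for s in stats):
        raise ValueError("file-level pruning needs file_name on every entry")
    return sorted(s.file_name for s in stats if may_contain(s, value))


# Hypothetical sample data: two partitions, and two files in one of them.
partition_stats = [
    ColumnRange("dt=2024-06-01", None, 0, 99),
    ColumnRange("dt=2024-06-02", None, 100, 199),
]
file_stats = [
    ColumnRange("dt=2024-06-02", "f1.parquet", 100, 149),
    ColumnRange("dt=2024-06-02", "f2.parquet", 150, 199),
]

candidate_partitions = prune_partitions(partition_stats, 160)
candidate_files = prune_files(file_stats, 160)
```

In this sketch, passing the partition-level entries to `prune_files` raises an error, which is loosely analogous to the failure described above when file-oriented logic is reused on entries that have no file name.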



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
