[ 
https://issues.apache.org/jira/browse/HUDI-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5587:
----------------------------
    Component/s: metadata

> Improve the generation of metadata records of bloom filters
> -----------------------------------------------------------
>
>                 Key: HUDI-5587
>                 URL: https://issues.apache.org/jira/browse/HUDI-5587
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: metadata
>            Reporter: Ethan Guo
>            Priority: Critical
>
> When updating the metadata table's bloom_filter partition, we read the
> parquet footers to get the bloom filters before converting them to metadata
> table records for upsert. This footer read can be affected by throttling, so
> we need to generate the bloom filters when the parquet files are written and
> pass them to the metadata writer.
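
The proposed flow can be illustrated with a small, self-contained sketch. This is not Hudi code: `SimpleBloomFilter`, `WriteResult`, `write_data_file`, and `MetadataWriter` are hypothetical names invented for illustration, and the toy filter stands in for whatever bloom filter implementation the real writer uses. The point it shows is that the filter is built while the file is written and handed to the metadata writer, so no footer has to be read back afterwards.

```python
import hashlib
from dataclasses import dataclass, field


class SimpleBloomFilter:
    """Toy bloom filter (hypothetical stand-in, not the Hudi implementation)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key: str):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


@dataclass
class WriteResult:
    """What the (hypothetical) file writer returns to its caller."""
    file_name: str
    bloom_filter: SimpleBloomFilter


def write_data_file(file_name: str, record_keys) -> WriteResult:
    """Build the bloom filter at write time instead of re-reading the footer."""
    bloom = SimpleBloomFilter()
    for key in record_keys:
        # A real writer would also append the record to the data file here.
        bloom.add(key)
    return WriteResult(file_name, bloom)


@dataclass
class MetadataWriter:
    """Collects bloom filter records keyed by file name (hypothetical)."""
    bloom_records: dict = field(default_factory=dict)

    def update_bloom_partition(self, results) -> None:
        # The filters arrive with the write results, so no footer read occurs.
        for result in results:
            self.bloom_records[result.file_name] = bytes(result.bloom_filter.bits)
```

A caller would pass the results of each write straight through, for example `MetadataWriter().update_bloom_partition([write_data_file("f1.parquet", ["k1", "k2"])])`. The trade-off in such a design is that the write path carries the filter in memory until the metadata update, in exchange for avoiding the extra storage reads.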



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
