[ https://issues.apache.org/jira/browse/HUDI-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Guo updated HUDI-5587:
----------------------------
Description: When updating the metadata table's bloom_filter partition, we
currently read the parquet footers to get the bloom filters, then convert them
to metadata table records for upsert. These footer reads can be affected by
throttling. To avoid them, we should generate the bloom filters when the
parquet files are written and pass them directly to the metadata writer.
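The proposed flow could look roughly like the sketch below. It is only an
illustration, not Hudi code: SimpleBloomFilter, DataFileWriter and
MetadataSink are hypothetical stand-ins for the real bloom filter, parquet
writer and metadata writer. The point it shows is that the filter is filled as
record keys are written and handed over on close, so the footer never has to
be read back.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BloomFilterAtWriteSketch {

  /** Toy bloom filter (hypothetical); stands in for the filter the real writer builds. */
  static final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
      this.bits = new BitSet(numBits);
      this.numBits = numBits;
      this.numHashes = numHashes;
    }

    void add(String key) {
      for (int i = 0; i < numHashes; i++) {
        bits.set(index(key, i));
      }
    }

    /** May return a false positive, never a false negative. */
    boolean mightContain(String key) {
      for (int i = 0; i < numHashes; i++) {
        if (!bits.get(index(key, i))) {
          return false;
        }
      }
      return true;
    }

    // Double hashing: combine two hashes of the key to derive the i-th bit index.
    private int index(String key, int i) {
      int h1 = key.hashCode();
      int h2 = fnv1a(key.getBytes(StandardCharsets.UTF_8)) | 1;
      return Math.floorMod(h1 + i * h2, numBits);
    }

    private static int fnv1a(byte[] data) {
      int hash = 0x811c9dc5;
      for (byte b : data) {
        hash ^= (b & 0xff);
        hash *= 0x01000193;
      }
      return hash;
    }
  }

  /** Hypothetical sink standing in for the metadata writer. */
  static final class MetadataSink {
    final Map<String, SimpleBloomFilter> filtersByFile = new LinkedHashMap<>();

    void accept(String fileName, SimpleBloomFilter filter) {
      filtersByFile.put(fileName, filter);
    }
  }

  /** Hypothetical data file writer that builds the filter while writing. */
  static final class DataFileWriter implements AutoCloseable {
    private final String fileName;
    private final MetadataSink sink;
    private final SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);

    DataFileWriter(String fileName, MetadataSink sink) {
      this.fileName = fileName;
      this.sink = sink;
    }

    void write(String recordKey) {
      // A real writer would also append the row to the parquet file here.
      filter.add(recordKey);
    }

    @Override
    public void close() {
      // Hand the in-memory filter to the sink; no footer read is needed.
      sink.accept(fileName, filter);
    }
  }

  public static void main(String[] args) {
    MetadataSink sink = new MetadataSink();
    try (DataFileWriter writer = new DataFileWriter("file-0001.parquet", sink)) {
      for (String key : List.of("key-1", "key-2", "key-3")) {
        writer.write(key);
      }
    }
    SimpleBloomFilter filter = sink.filtersByFile.get("file-0001.parquet");
    System.out.println("key-2 possibly present: " + filter.mightContain("key-2"));
  }
}
```

In this sketch the sink simply keeps the filters in a map keyed by file name;
how the real metadata writer would turn them into bloom_filter partition
records is outside what the issue describes.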
> Improve the generation of metadata records of bloom filters
> -----------------------------------------------------------
>
> Key: HUDI-5587
> URL: https://issues.apache.org/jira/browse/HUDI-5587
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: Ethan Guo
> Priority: Critical