Vinoth Chandar created HUDI-351:
-----------------------------------
Summary: Implement Range + Bloom Filter checking in one go to
improve speed of index
Key: HUDI-351
URL: https://issues.apache.org/jira/browse/HUDI-351
Project: Apache Hudi (incubating)
Issue Type: New Feature
Components: Index, Performance
Reporter: Vinoth Chandar
Currently, we read the min/max ranges once for range pruning and then read the
footer metadata again to check the bloom filter.
Once Spark 2.4 support is in place and the 2GB limitations are gone, it is worth
revisiting whether we could do this in a single pass for cases where the bloom
filters fit into memory, or implement this check as an RDD operation.
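A rough sketch of the single-pass idea, in plain Python rather than Spark, may help
frame the discussion. It assumes the key range and the bloom filter for a file can be
loaded together and held in memory. All names here (SimpleBloomFilter, FileFooter,
build_footer, candidate_files) are hypothetical and are not Hudi APIs; the bloom
filter is a toy stand-in, not the one Hudi uses.

```python
import hashlib
from dataclasses import dataclass


class SimpleBloomFilter:
    """Toy Bloom filter for illustration only (not Hudi's implementation)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


@dataclass
class FileFooter:
    """Hypothetical footer: min/max key range and Bloom filter loaded together."""
    file_id: str
    min_key: str
    max_key: str
    bloom: SimpleBloomFilter


def build_footer(file_id, keys):
    """Build an in-memory footer from the record keys stored in one file."""
    bloom = SimpleBloomFilter()
    for key in keys:
        bloom.add(key)
    return FileFooter(file_id, min(keys), max(keys), bloom)


def candidate_files(footers, record_keys):
    """Visit each footer once, applying the range check and then the Bloom check."""
    result = {key: [] for key in record_keys}
    for footer in footers:
        for key in record_keys:
            in_range = footer.min_key <= key <= footer.max_key
            if in_range and footer.bloom.might_contain(key):
                result[key].append(footer.file_id)
    return result
```

For example, with footers built from ["a01", "a05", "a09"] (file "f1") and
["b01", "b05"] (file "f2"), looking up "a05" returns only "f1", and a key such as
"c01" that falls outside every range is rejected without consulting any bloom filter.
In an RDD-based variant, the loop over footers would presumably become a map over
files, but that mapping is not worked out here.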
--
This message was sent by Atlassian Jira
(v8.3.4#803005)