Vinoth Chandar created HUDI-351:
-----------------------------------

             Summary: Implement Range + Bloom Filter checking in one go to 
improve speed of index
                 Key: HUDI-351
                 URL: https://issues.apache.org/jira/browse/HUDI-351
             Project: Apache Hudi (incubating)
          Issue Type: New Feature
          Components: Index, Performance
            Reporter: Vinoth Chandar


Currently, we read the min/max ranges once for range pruning, and then read the 
footer metadata a second time to check the bloom filter.

Once Spark 2.4 support is in and the 2GB limitations are gone, it is worth 
revisiting whether we could do this in a single pass for cases where the bloom 
filters fit into memory, or implement this check as an RDD operation.
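
A rough sketch of what "range + bloom filter checking in one go" could look 
like, for illustration only. It is not Hudi code: SimpleBloomFilter, FileFooter, 
build_footer and candidate_files are hypothetical names, and the sketch assumes 
every footer (key range plus bloom filter) fits in memory. Each footer is 
visited once, and both checks run against it before moving to the next file.

```python
import hashlib
from dataclasses import dataclass


class SimpleBloomFilter:
    """Toy Bloom filter standing in for the filter stored in a file footer."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key: str):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(key))


@dataclass
class FileFooter:
    """Hypothetical footer metadata: min/max record key plus Bloom filter."""
    file_id: str
    min_key: str
    max_key: str
    bloom: SimpleBloomFilter


def build_footer(file_id, keys):
    """Build an in-memory footer for a file holding the given record keys."""
    bloom = SimpleBloomFilter()
    for key in keys:
        bloom.add(key)
    return FileFooter(file_id, min(keys), max(keys), bloom)


def candidate_files(footers, incoming_keys):
    """Map each incoming key to the files that may contain it.

    Each footer is visited once; the range check and the Bloom filter
    check are applied together instead of in two separate passes.
    """
    candidates = {key: [] for key in incoming_keys}
    for footer in footers:
        for key in incoming_keys:
            in_range = footer.min_key <= key <= footer.max_key
            if in_range and footer.bloom.might_contain(key):
                candidates[key].append(footer.file_id)
    return candidates
```

Files returned for a key are only candidates, since a bloom filter can report 
false positives; the actual key lookup in those files would still be needed.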



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
