Hi xuchuanyin,

1. There is no need to maintain a separate bloom configuration for the task-level bloom, because it uses the same user-provided configuration (size and fpp). We simply create the task-level bloom with that configuration alongside the blocklet-level bloom.
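To illustrate point 1, here is a minimal sketch in Python rather than the actual CarbonData code. The names `BloomConfig`, `BloomFilter` and `build_task_blooms` are hypothetical, and the sizing uses the textbook bloom filter formulas, which is an assumption about how size and fpp would be applied. It only shows that one user-supplied configuration can drive both the blocklet-level blooms and the task-level bloom in a single pass.

```python
import hashlib
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BloomConfig:
    # Hypothetical holder for the two user-supplied settings (size and fpp).
    expected_items: int
    fpp: float


class BloomFilter:
    def __init__(self, config):
        n, p = config.expected_items, config.fpp
        # Textbook sizing: m = -n ln p / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(1, math.ceil(-n * math.log(p) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, value):
        # Double hashing: derive k bit positions from two 64-bit hash halves.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(value)
        )


def build_task_blooms(blocklets, config):
    """Build one bloom per blocklet plus one task-level bloom, all from `config`."""
    task_bloom = BloomFilter(config)
    blocklet_blooms = []
    for values in blocklets:
        bloom = BloomFilter(config)
        for value in values:
            bloom.add(value)       # blocklet-level entry
            task_bloom.add(value)  # same value also goes into the task-level bloom
        blocklet_blooms.append(bloom)
    return task_bloom, blocklet_blooms
```

At query time the task-level bloom would be checked first, and the blocklet-level blooms only for tasks that may contain the value.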
2. The task-level bloom is much smaller than the blocklet-level bloom, but yes, as the data or the number of tasks grows, it will also grow over time. Even so, we can keep it in the driver LRU cache: we may not query all of the data all the time, so the cache holds only the most recently used blooms. If the bloom is very large, we can also skip driver-side bloom pruning and prune only on the executor side.

Yes, we can maintain the bloom at the carbondata footer level, like parquet/orc, but then we would lose datamap framework features such as lazy datamap loading or creation. Instead, we can keep the bloom in separate files but maintain a footer for the file, as mentioned in my earlier mail.

Regards,
Ravindra.

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
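P.S. A rough sketch of the driver-side caching idea from point 2, again in Python and not the actual CarbonData cache. `DriverBloomCache`, `prune_tasks_on_driver`, `max_bloom_bytes`, and the `size_of`/`load` callbacks are all hypothetical names, and plain Python sets stand in for bloom filters so the example stays short. It shows an LRU cache bounded by bytes, plus the fallback of leaving a task to executor-side pruning when its bloom is too large for the driver.

```python
from collections import OrderedDict


class DriverBloomCache:
    """LRU cache of task-level blooms keyed by task id, bounded by total bytes."""

    def __init__(self, max_bytes, max_bloom_bytes):
        self.max_bytes = max_bytes
        self.max_bloom_bytes = max_bloom_bytes  # hypothetical per-bloom limit
        self.used_bytes = 0
        self._entries = OrderedDict()  # task_id -> (bloom, size_bytes)

    def get(self, task_id, size_of, load):
        """Return the cached or loaded bloom, or None if it is too large for the driver."""
        if task_id in self._entries:
            self._entries.move_to_end(task_id)  # mark as most recently used
            return self._entries[task_id][0]
        size = size_of(task_id)
        if size > self.max_bloom_bytes:
            return None  # caller leaves pruning for this task to the executors
        bloom = load(task_id)
        self._entries[task_id] = (bloom, size)
        self.used_bytes += size
        # Evict least recently used entries until the cache fits its budget.
        while self.used_bytes > self.max_bytes and len(self._entries) > 1:
            _, (_, evicted_size) = self._entries.popitem(last=False)
            self.used_bytes -= evicted_size
        return bloom


def prune_tasks_on_driver(task_ids, value, cache, size_of, load):
    """Keep tasks whose bloom may contain `value`, and tasks with no driver-side bloom."""
    kept = []
    for task_id in task_ids:
        bloom = cache.get(task_id, size_of, load)
        if bloom is None or value in bloom:
            kept.append(task_id)
    return kept
```

A task whose bloom is skipped on the driver is never dropped there, so the result stays correct and only the pruning work moves to the executor.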
