Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2323#discussion_r189791768

    --- Diff: docs/datamap/bloomfilter-datamap-guide.md ---
    @@ -0,0 +1,94 @@
    +# CarbonData BloomFilter DataMap (Alpha feature in 1.4.0)
    +
    +* [DataMap Management](#datamap-management)
    +* [BloomFilter Datamap Introduction](#bloomfilter-datamap-introduction)
    +* [Loading Data](#loading-data)
    +* [Querying Data](#querying-data)
    +* [Data Management](#data-management-with-bloomfilter-datamap)
    +
    +#### DataMap Management
    +Creating BloomFilter DataMap
    +  ```
    +  CREATE DATAMAP [IF NOT EXISTS] datamap_name
    +  ON TABLE main_table
    +  USING 'bloomfilter'
    +  DMPROPERTIES ('index_columns'='city, name', 'BLOOM_SIZE'='640000', 'BLOOM_FPP'='0.00001')
    +  ```
    +
    +Dropping specified datamap
    +  ```
    +  DROP DATAMAP [IF EXISTS] datamap_name
    +  ON TABLE main_table
    +  ```
    +
    +Showing all DataMaps on this table
    +  ```
    +  SHOW DATAMAP
    +  ON TABLE main_table
    +  ```
    +It will show all DataMaps created on main table.
    +
    +
    +## BloomFilter DataMap Introduction
    +A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set.
    +Carbondata introduce BloomFilter as an index datamap to enhance the performance of querying with precise value.
    +Internally, CarbonData maintains a BloomFilter per blocklet for each index column to indicate that whether a value of the column is in this blocklet.
    --- End diff --

    OK
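
For readers following the quoted introduction, the per-blocklet idea can be sketched roughly as below. This is only an illustrative toy in Python, not CarbonData's implementation: the class `BloomFilter`, the helpers `build_blocklet_filters` and `prune_blocklets`, and the SHA-256 double-hashing scheme are all assumptions made for the example. The parameters `expected_items` and `fpp` are loosely modeled on `BLOOM_SIZE` and `BLOOM_FPP` from the diff, assuming those mean the expected number of inserted values and the target false-positive probability.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter (hypothetical; not CarbonData's actual class)."""

    def __init__(self, expected_items, fpp):
        # Standard sizing formulas: m = -n*ln(p) / (ln 2)^2, k = (m/n) * ln 2.
        ln2 = math.log(2)
        self.num_bits = max(1, math.ceil(-expected_items * math.log(fpp) / ln2 ** 2))
        self.num_hashes = max(1, round(self.num_bits / expected_items * ln2))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, value):
        # Double hashing: derive k bit positions from two 64-bit hash halves.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(value))


def build_blocklet_filters(blocklets, expected_items=1000, fpp=0.001):
    """Build one filter per blocklet for a single index column."""
    filters = {}
    for blocklet_id, values in blocklets.items():
        bloom = BloomFilter(expected_items, fpp)
        for value in values:
            bloom.add(value)
        filters[blocklet_id] = bloom
    return filters


def prune_blocklets(filters, value):
    """Return the blocklets that may hold `value`; the rest can be skipped."""
    return [bid for bid, bloom in filters.items() if bloom.might_contain(value)]
```

With filters built from something like `{0: ["beijing", "shanghai"], 1: ["shenzhen"]}`, calling `prune_blocklets(filters, "shenzhen")` always includes blocklet 1, because a Bloom filter never gives false negatives. It may occasionally include other blocklets as false positives, which costs an extra scan but does not affect correctness.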
---