[ https://issues.apache.org/jira/browse/HBASE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624654#comment-16624654 ]
Hudson commented on HBASE-20636:
--------------------------------

Results for branch master
[build #505 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/505/]: (x) *{color:red}-1 overall{color}*
----
details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/505//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/505//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/505//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}

> Introduce two bloom filter type : ROWPREFIX and ROWPREFIX_DELIMITED
> -------------------------------------------------------------------
>
>                 Key: HBASE-20636
>                 URL: https://issues.apache.org/jira/browse/HBASE-20636
>             Project: HBase
>          Issue Type: New Feature
>          Components: HFile, regionserver, scan
>            Reporter: Guangxu Cheng
>            Assignee: Guangxu Cheng
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-20636.master.001.patch,
> HBASE-20636.master.002.patch, HBASE-20636.master.003.patch,
> HBASE-20636.master.004.patch, HBASE-20636.master.005.patch
>
>
> HBase uses Bloom filters (ROW and ROWCOL) to skip unnecessary files and
> improve read performance, but they only support Get, not Scan.
> At our company (Tencent), many users, such as Tencent Game, need to scan all
> rows with the same prefix. Each game user's operational records are written
> to HBase, each user has many records, and the rowkey is constructed as
> userid+'#'+timestamps, so we can scan all records for a given user over a
> specified period.
> For this scenario, we designed the prefix Bloom filter. If the startRow and
> stopRow of a Scan have a valid common prefix, the scan is allowed to use the
> Bloom filter to filter out files, which improves scan performance.
> This feature has been running on our cluster for over a year, and scan
> performance for this scenario has more than doubled.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
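
The common-prefix condition described in the issue can be illustrated with a small, self-contained Java sketch. It is not the code from the attached patches. The class name `PrefixBloomSketch`, the helper names, and the rule that a "valid" prefix means "at least a configured number of bytes" are assumptions made for illustration only; the issue text does not define what counts as valid.

```java
import java.util.Arrays;

// Illustrative sketch only; not taken from the HBASE-20636 patches.
class PrefixBloomSketch {

    // Returns the longest leading byte sequence shared by both row keys.
    static byte[] commonPrefix(byte[] startRow, byte[] stopRow) {
        int limit = Math.min(startRow.length, stopRow.length);
        int i = 0;
        while (i < limit && startRow[i] == stopRow[i]) {
            i++;
        }
        return Arrays.copyOf(startRow, i);
    }

    // Assumption: the prefix is "valid" when the scan bounds share at least
    // prefixLength leading bytes, so every row in the range has that prefix
    // and a per-file Bloom filter keyed on it could rule files out.
    static boolean canUsePrefixBloom(byte[] startRow, byte[] stopRow, int prefixLength) {
        return prefixLength > 0
            && commonPrefix(startRow, stopRow).length >= prefixLength;
    }
}
```

With rowkeys of the form userid+'#'+timestamps, a scan bounded by two keys for the same user shares the whole "userid#" portion, so the check passes for a prefix length covering it. A scan that spans two different users does not share that portion, and the sketch falls back to not using the filter.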