[
https://issues.apache.org/jira/browse/HBASE-21520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720002#comment-16720002
]
Zheng Hu commented on HBASE-21520:
----------------------------------
We have 10 (NUM_FLUSHES=10) HFiles here, and the test puts ~1000 cells into
the table (rows=20, ts=6, qualifiers=8, total=20*6*8=960 ~ 1000). Each full
table scan therefore checks the ROWCOL bloom filter 20 (rows) * 8 (qualifiers)
* 10 (hfiles) = 1600 times. If we take the average full table scan cost to be
50 ms, then each bloom filter check costs 50 (ms) / 1600.0 = 0.031 ms ...
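
The arithmetic above can be reproduced with a small standalone sketch. It is
only a back-of-the-envelope estimate: the class and method names are
hypothetical (not part of HBase), it assumes one ROWCOL bloom check per
(row, qualifier) pair per HFile as in the comment, and the 50 ms scan cost is
the assumed average from the comment, not a measured value.

```java
// Hypothetical helper for the estimate in this comment; not an HBase API.
final class BloomCheckEstimate {

    // Number of cells written: one per (row, timestamp, qualifier) combination.
    static long totalCells(int rows, int timestamps, int qualifiers) {
        return (long) rows * timestamps * qualifiers;
    }

    // Assumption: a full scan does one ROWCOL bloom check per
    // (row, qualifier) pair in every HFile.
    static long bloomChecksPerScan(int rows, int qualifiers, int hfiles) {
        return (long) rows * qualifiers * hfiles;
    }

    // Average cost of a single bloom check, given the cost of one full scan.
    static double costPerCheckMs(double scanCostMs, long checks) {
        return scanCostMs / checks;
    }

    public static void main(String[] args) {
        int rows = 20, timestamps = 6, qualifiers = 8, hfiles = 10;
        double assumedScanCostMs = 50.0; // assumed average, taken from the comment

        long cells = totalCells(rows, timestamps, qualifiers);
        long checks = bloomChecksPerScan(rows, qualifiers, hfiles);
        double perCheckMs = costPerCheckMs(assumedScanCostMs, checks);

        System.out.println("cells=" + cells);
        System.out.println("bloom checks per scan=" + checks);
        System.out.println("ms per bloom check=" + perCheckMs);
    }
}
```

Changing hfiles or qualifiers in main shows how the number of bloom checks
per scan grows linearly with each of them.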
> TestMultiColumnScanner cost long time when using ROWCOL bloom type
> ------------------------------------------------------------------
>
> Key: HBASE-21520
> URL: https://issues.apache.org/jira/browse/HBASE-21520
> Project: HBase
> Issue Type: Bug
> Components: test
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Attachments: HBASE-21520.v1.patch, TestMultiColumnScanner.png,
> rowcol.txt
>
>
> TestMultiColumnScanner times out easily; see HBASE-21517.
> On my local machine, with the parameters set to {
> Compression.Algorithm.NONE, BloomType.ROW, false }, it took about 5 seconds,
> but with the parameters set to { Compression.Algorithm.NONE,
> BloomType.ROWCOL, false }, it took about 45 seconds, which means
> ROWCOL costs much more time than ROW.
> We need to find out what's wrong with this unit test.