[
https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370416#comment-14370416
]
Andrew Purtell commented on HBASE-13291:
----------------------------------------
Those flame graphs show on-CPU differences. It would be really interesting to see
where, and for how long, IO worker threads are waiting, and how that differs
between the two cases.
A quick look over the nofilter and filterall graphs suggests that the filterall
case removes a lot of RPC work, so it shifts on-CPU processing from a mix of:
- RPC handling and related data copies
- skipping to desired cells (skipping through blocks in the blockcache, and logic
in SQM#match)
to mostly skipping to desired cells. Maybe we could optimize this with an index
into the blockcache, but the actual bottleneck may lie elsewhere, in
queueing/threading issues.
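To make the "skipping to desired cells" cost concrete, here is a heavily simplified, hypothetical sketch of the kind of per-cell match loop that HBase's ScanQueryMatcher (SQM#match) drives. The class, enum, and method names below are illustrative stand-ins, not actual HBase code: the point is only that every cell in scanned blocks passes through a match decision, so this CPU cost remains even when --filterAll discards results before any RPC response is built.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanSketch {
    // Simplified stand-in for HBase's ScanQueryMatcher.MatchCode outcomes.
    enum MatchCode { INCLUDE, SKIP }

    // Hypothetical matcher: include only cells in the "wanted" column.
    static MatchCode match(String row, String column) {
        return column.equals("wanted") ? MatchCode.INCLUDE : MatchCode.SKIP;
    }

    // Each cell is {row, column}. The scan walks every cell and the matcher
    // decides whether it is returned; skipped cells still cost CPU to visit.
    static List<String> scan(String[][] cells) {
        List<String> results = new ArrayList<>();
        for (String[] cell : cells) {
            if (match(cell[0], cell[1]) == MatchCode.INCLUDE) {
                results.add(cell[0] + ":" + cell[1]);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        String[][] cells = {
            {"row1", "wanted"}, {"row1", "other"},
            {"row2", "wanted"}, {"row2", "other"},
        };
        // Four cells visited, two included.
        System.out.println(scan(cells));
    }
}
```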
> Lift the scan ceiling
> ---------------------
>
> Key: HBASE-13291
> URL: https://issues.apache.org/jira/browse/HBASE-13291
> Project: HBase
> Issue Type: Improvement
> Components: Scanners
> Affects Versions: 1.0.0
> Reporter: stack
> Assignee: stack
> Attachments: traces.filterall.svg, traces.nofilter.svg
>
>
> Scanning medium-sized rows with multiple concurrent scanners exhibits
> interesting 'ceiling' properties. A server runs at about 6.7k ops a second,
> using 450% of a possible 1600% of CPU, when 4 clients each run 10 threads
> doing scans of 1000 rows. If I add the '--filterAll' argument (do not return
> results), then we run at 1450% of the possible 1600%, but we do 8k ops a
> second.
> Let me attach flame graphs for the two cases. Unfortunately, there is some
> frustrating dark art going on. Let me try to figure it out... Filing this issue
> in the meantime to keep score.
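One way to read those numbers (my arithmetic, not from the ticket): normalizing throughput by busy cores shows the filterall run burns far more CPU per completed scan, which is consistent with the per-cell skipping work dominating once RPC result-building is removed.

```java
public class ScanCeilingMath {
    public static void main(String[] args) {
        // nofilter: 6.7k ops/s at 450% CPU (4.5 busy cores of a possible 16)
        double noFilterOpsPerCore = 6700.0 / 4.5;
        // filterall: 8k ops/s at 1450% CPU (14.5 busy cores)
        double filterAllOpsPerCore = 8000.0 / 14.5;
        // ~1489 vs ~552 ops/s per busy core
        System.out.printf("%.0f vs %.0f%n", noFilterOpsPerCore, filterAllOpsPerCore);
    }
}
```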
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)